

THE UNIVERSITY OF ALBERTA 


ESTIMATION OF QUANTILE IN FINITE POPULATION 


WITH SUPER-POPULATION MODEL 


by


MD. ZAHORUL ISLAM 


A THESIS 
SUBMITTED TO THE FACULTY OF GRADUATE STUDIES AND RESEARCH 
IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE 


OF MASTER OF SCIENCE 


DEPARTMENT OF 


STATISTICS AND APPLIED PROBABILITY 


EDMONTON, ALBERTA 


SPRING, 1983 


ABSTRACT 


A literature survey on various types of super-population models
and their uses in finite population sampling is given. Topics like
optimum sampling, balanced samples and randomization in survey sampling
are discussed in detail. We have proved some properties of the univariate
distribution of a quantile of a sample from a finite population. One of
our main objectives was to explore the possible use of auxiliary
information for estimating finite population quantiles. Keeping this
goal in mind, we have derived the bivariate distribution of sample
quantiles and their asymptotic distribution. Assuming a certain
super-population model, we have suggested an estimator for finite
population quantiles which involves the auxiliary variable. Further
study will be required to establish useful properties of the suggested
estimator.




ACKNOWLEDGEMENT 


I wish to express my grateful thanks to my supervisor, 
Professor K.L. Mehra for introducing me to the problem and for 
the considerable time and effort that he has spent on my work 
in preparing this thesis. 

I would also like to thank Mr. Andrew Luong and
Ms. Martha S. Rhodes for helping me in proofreading and
M. June Talpash for her excellent typing of this thesis.


Digitized by the Internet Archive 
in 2023 with funding from
University of Alberta Library 


https://archive.org/details/Islam1983 


TABLE OF CONTENTS


CHAPTER

I     INTRODUCTION
II    SUPER-POPULATION MODELS AND PREDICTION
      2.1  INTRODUCTION
      2.2  DIFFERENT SUPER-POPULATION MODELS
      2.3  SOME DEFINITIONS AND TERMINOLOGIES
      2.4  PREDICTION UNDER DESIGN ORIENTED
           SUPER-POPULATION MODEL
      2.5  PREDICTION UNDER DESIGN FREE
           SUPER-POPULATION MODEL
      2.6  ROBUSTNESS IN MODEL BASED INFERENCE
III   RANDOMIZATION AND BALANCED SAMPLING
      3.1  RANDOMIZATION
      3.2  BALANCED SAMPLING
IV    RANKS AND ORDER STATISTICS FOR FINITE
      POPULATION
      4.1  ORDER STATISTICS IN SAMPLING FROM
           FINITE POPULATIONS
      4.2  CONFIDENCE INTERVAL OF QUANTILES IN
           FINITE POPULATIONS
      4.3  JOINT DISTRIBUTION OF QUANTILES OF
           A SAMPLE FROM BIVARIATE FINITE
           POPULATIONS
      4.4  PREDICTION OF FINITE POPULATION
           QUANTILE USING AUXILIARY VARIABLES




V     ASYMPTOTIC RESULTS FOR SAMPLES FROM
      FINITE POPULATION
      5.1  INTRODUCTION
      5.2  SOME ASYMPTOTIC RESULTS
      5.3  ASYMPTOTIC BIVARIATE DISTRIBUTION
           OF SAMPLE QUANTILES
REFERENCES


CHAPTER I 


INTRODUCTION 


In sample survey theory the basic assumption is that we have
a fixed finite population of N identifiable units under study. Due
to lack of time and resources, which include money, expertise, etc., we
are constrained to study only a part of the population concerned. Of
course, there are situations where a complete study of the population
is not at all feasible and one has to depend on sample survey methods.
The objective in survey sampling is to make inferences about some
characteristics of the population. Most of the literature on sample
surveys deals with the estimation of the population total, population
mean, population proportion and the standard errors of their estimates.

The object of this thesis is to study recent developments in
survey sampling dealing with the estimation of quantiles of finite
populations and inference under super-population models. It is a common
practice to use auxiliary information for estimating the population mean
or total. Analogously, our interest in this work is to study the use
of auxiliary information for improving the estimation of finite
population quantiles. For example, can we use with benefit the
information on quantiles of auxiliary variables in estimating the
quantile of the main variable? Keeping this objective in mind, we have
derived the bivariate distribution of sample quantiles and studied its
asymptotic behaviour.

In this chapter we shall try to point out some shortcomings
of the conventional approach and give a brief history of the development of




the model based approach to inference in survey sampling as an
alternative to the conventional one.

Let there be $N < \infty$ units in the population. $N$ is called
the population size. The units are identifiable, that is, units of the
population can be uniquely labelled from 1 to $N$ and the label of
each unit is known. We can denote the population by $U = \{1, \ldots, N\}$.
With each unit $i$, there is associated a measurement $y_i$ on a variable
character $y$. For all practical purposes $y_i$ is real for all
$i = 1, \ldots, N$. We shall represent the auxiliary information on
unit $i$ by a real measurement $x_i$ or a real vector $\mathbf{x}_i$. Our
target is to estimate the population total, $y = \sum_{i=1}^{N} y_i$. The
method is to draw a representative sample of the population, and on the
basis of the sample observations we have to estimate the population
total. Let $s = \{i_1, \ldots, i_n\}$ be the sampled units without
repetition and $\bar{s} = U - s$ be the units not in the sample $s$.
Then we have the population total,

(1.1)    $y = \sum_{i \in s} y_i + \sum_{i \in \bar{s}} y_i$.
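
As an illustrative sketch of the decomposition in (1.1), the following Python fragment splits the total of a small population into the part observed in the sample and the part that remains to be estimated. The population values, the sample labels and the function name `split_total` are hypothetical and are used for the example only.

```python
def split_total(y, sample):
    """Return (sampled_sum, unsampled_sum) for the population values y.

    y      : dict mapping each unit label 1..N to its measurement y_i
    sample : collection of labels drawn without repetition (the set s)
    """
    s = set(sample)
    s_bar = set(y) - s                      # units not in the sample
    sampled_sum = sum(y[i] for i in s)      # first sum in (1.1), known exactly
    unsampled_sum = sum(y[i] for i in s_bar)  # second sum, unknown in practice
    return sampled_sum, unsampled_sum


# Hypothetical population of N = 6 units and a sample of n = 3 units.
y = {1: 4.0, 2: 7.5, 3: 1.5, 4: 9.0, 5: 3.0, 6: 5.0}
s = {2, 4, 5}

known, unknown = split_total(y, s)
total = known + unknown
```

In a real survey only `known` is available; `unknown` is computed here solely because the whole toy population is visible.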


In estimating $y$, the first sum on the right hand side of (1.1) is
exactly known to us (assuming there is no measurement error) and our
object is to estimate the second sum, namely the total of the non-sampled
part of the population. Survey methodology available in common survey
sampling textbooks (e.g. Cochran, 1977) is devoted primarily to finding
a good survey design, suitable to the practical situation, for estimating
the unknown part of the population total. This approach is now commonly
known as the conventional approach or the design based approach. At this
point it is interesting to mention a few lines from Basu (1969):




"The objective of planning a survey should be to end up with 
a good sample. The term 'representative sample' has been used in survey
terminology. But no one has cared to give a precise definition of the
term. It is implicitly taken for granted that statistician with his
biased mind is unable to select a representative sample. So a
simplistic solution is sought by turning to an unbiased die (the
random number tables). Thus, a deaf and dumb die is supposed to do
the job of selecting a 'representative sample' better than a trained
statistician."


Broadly speaking, there are three main methods of estimation
of finite population parameters. These are methods based on
1. measurements of units which are exact, that is, there is no
error in measurements;
2. measurements which are not exact but subject to random
errors; and
3. knowledge of some process which generates the measurements on
a given unit.


Traditionally, randomization has been regarded as an essential part of
survey sampling for objective inferences and estimability of the
standard errors of estimates. In Case 1 above, randomization is created
by the sampler through specific survey designs. This was the approach
adopted by statisticians in developing the subject of sample surveys.
In this design based approach it is assumed that the population values
$y_1, \ldots, y_N$ are fixed and hence $y = (y_1, \ldots, y_N)$ can be
treated as a parameter of the population under consideration. Our
interest is in some function of this parameter, say $g(y)$. There are
certain authors,




for example, Neyman (1971), who tend to focus attention only on
estimation based on man-made randomization in the form of design.

In Case 2, there are two sources of randomization: (i) created
randomization based on survey design, and (ii) random error associated
with the measurements of the sample units, which is commonly known as
non-sampling error or bias. This latter aspect is beyond the scope of this
thesis. Modern development of survey sampling techniques mainly follows
the line of Case 3. This is widely known as the super-population approach
or model based approach. Under this approach it is assumed that with each
population unit is associated a random variable for which a stochastic
structure is specified. The actual value associated with the population
unit is treated as an outcome of that random variable. We shall discuss
various super-population models and methods of estimation under these
models in Chapter 2. The super-population approach is an elegant
development by statisticians, through which important new methods are
currently being added to the traditional methodology of survey sampling.
Authors like Barnard (1971), Kalbfleisch and Sprott (1969), Royall (1970,
1971), etc. consider inference based on super-population models not only
desirable but almost necessary. Some of the authors have strongly
criticized the idea of the sample design producing the only source of
randomness in the data, one injected by the survey statistician himself.
"The survey statistician does not lean on probability-theory for the
purpose of understanding and controlling the mess created by an
unavoidable source of randomness or uncertainty (observation error)",
Basu (1969). Basu examined the randomization principle in survey sampling
and came to the conclusion that there is very little, if any, use for the
survey designs. Chapter 3 deals with the randomization principle and its
alternatives.




Although we have called the common and well-known approach
to survey sampling traditional or conventional, the idea of
super-populations is also not new. Cochran (1939, 1946), Deming and
Stephan (1941), Madow and Madow (1944) and Mahalanobis (1944) are early
users of the super-population idea. Deming and Stephan (1941) were the
first to clearly mention the idea of a variable, rather than fixed,
status of the population. They made the comment that the census is a
sample only and suggested that it is one of many populations that might
have resulted. They regarded the difference between a census and a sample
survey as a matter of degree, and considered the census population as
one whose state of nature is changing with time. Cochran (1946) first
clearly assumed that the finite population we have at our disposal is
actually a sample from an infinite population. He considered the
population in which the variance among the elements in any group of
contiguous elements increases as the size of the group increases. This
type of population was also considered by Smith (1938), Jessen (1942),
Mahalanobis (1944) and Hansen and Hurwitz (1943). Various mathematical
models have been considered by these authors for representing the
situation where the variance within a group is directly proportional to
the size of the group measure $x_i$. Cochran (1946) considered that the
elements $x_i$ are drawn from different populations and assumed that the
population changes in some regular manner with the value $i$.
Alternatively, he suggested that the $x_i$ belong to the same population
but are serially correlated, and found it more reasonable to consider the
finite population as a sample from an infinite population.

The idea of super-population models, i.e. the idea of
considering the existing population as a sample from an infinite




population, started much earlier. Unfortunately, the theoretical aspect
of the model based approach did not attract much attention from
statisticians until the 1960's. It is well-known that the use of
supplementary information in the estimation of finite population
parameters, in general, increases the accuracy of the estimator. So,
samplers felt the need for a comparison of the relative accuracy of
sample designs using such information. This comparison becomes difficult
if we cannot assume any functional relationship in the data. One solution
to the problem is to regard the finite population as a random sample from
an infinite super-population model having certain properties. The results
so obtained do not apply to any single finite population but to the
average of all finite populations that can be drawn from the infinite
population, .(Ray,. 2950). 

Early works on super-population models are based on some type
of linear regression model with heteroscedastic error variances.
Hacking (1965) has proposed the concept of "chance set-up" as logically
superior to the postulate of a hypothetical infinity of populations. For
example, the linear regression super-population model may be viewed as
defining a random set-up rather than an infinity of hypothetical
populations, if so desired. But the analyses are mathematically
identical. Forman and Brewer (1971) have given comparisons of the
efficiencies of six methods of sampling in common use. The model they
used (also a commonly used super-population model) is an infinite set of
theoretical populations, each of size $N$. Units are identifiable,
having two measures $Y_i$ and $X_i$ on the $i$th unit, where $Y_i$ is the
measurement of the character of interest and $X_i$ is the measurement of
the auxiliary information (e.g. size of the unit), and is related to $Y_i$




as follows:

(1.2)    $Y_i = \alpha + \beta X_i + e_i$,    $i = 1, \ldots, N$,

where $\alpha$ and $\beta$ are constants and the $e_i$'s are random
variables with $E(e_i) = 0$, $E(e_i^2) = \sigma_i^2$ (sometimes
$E(e_i^2)$ is some function of $X_i$) and $E(e_i e_j) = 0$ for all
$i \neq j$. Here the expectation, $E$, is over all hypothetical
populations, and $\sigma_i^2$ is constant over all these populations
but varies with $i$.
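
To make the phrase "expectation over all hypothetical populations" concrete, the following Python sketch repeatedly draws finite populations from a model of the form (1.2) and averages them. Everything specific here is an assumption made for illustration: the function name `draw_population`, the size measures, the values alpha = 2 and beta = 3, the variance function 0.5 * X_i, and the use of normal errors (the model itself fixes only the first two moments of the errors).

```python
import random


def draw_population(x, alpha, beta, sigma2_fn, rng):
    """Draw one finite population (Y_1, ..., Y_N) with Y_i = alpha + beta*X_i + e_i.

    The errors e_i are independent with mean 0 and variance sigma2_fn(X_i).
    Normal errors are an assumption of this sketch only.
    """
    return [alpha + beta * xi + rng.gauss(0.0, sigma2_fn(xi) ** 0.5) for xi in x]


rng = random.Random(0)
x = [1.0, 2.0, 4.0, 8.0, 16.0]      # hypothetical size measures X_i


def variance(xi):
    # Hypothetical heteroscedastic variance: sigma_i^2 proportional to X_i.
    return 0.5 * xi


# Average over many hypothetical populations to approximate E(Y_i).
reps = 20000
sums = [0.0] * len(x)
for _ in range(reps):
    y = draw_population(x, alpha=2.0, beta=3.0, sigma2_fn=variance, rng=rng)
    sums = [a + b for a, b in zip(sums, y)]
means = [a / reps for a in sums]    # each should be close to 2.0 + 3.0 * X_i
```

Any single simulated population departs from the regression line; only the average across populations recovers it, which is the sense in which model based results refer to the super-population rather than to one realized finite population.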

During the 1960's, statisticians devoted much attention
to the theoretical aspects of survey sampling. For a long time there
were big gaps between survey sampling theories and statistical inference
theories. In the traditional books on survey sampling, authors used
the statistical inference theories under the assumption of large
samples. The maximum likelihood method of estimation in statistical
inference was essentially (for a long time) a failure in survey sampling
situations. If the sample is drawn with probability proportional to
the size of the unit, then how valid are traditional methods in the
theory of hypothesis testing or the theory of statistical inference? The
answer to this question is still unknown. However, in the late 60's
and early 70's it became possible to relate likelihood methods and
Bayesian methods to finite populations. Some examples are Royall (1968,
1976a), Hartley and Rao (1968, 1969), Kalbfleisch and Sprott (1970),
C.R. Rao (1971), Ericson (1969a, b), Solomon and Zacks (1970), Basu
(1969), Zacks (1969), Godambe (1966, 1968), Godambe and Thompson
(1971), Godambe and Joshi (1965), etc.

The most remarkable and striking development in survey sampling
theory during the 1960's is the development of design free inferences.




There are some strong critics of the use of survey design for inference
on finite populations. Godambe (1966) noted that the application of the
likelihood principle in a sampling situation would mean that the sampling
design is irrelevant for data analysis. Basu (1969) examined the role
of sufficiency and the likelihood principle and gave the conclusion,
"Once the sample has been drawn, the inference should not depend in any
way on the sampling design. This poses the problem of designing a
survey which will yield a good (representative) sample." He also
examined the randomization principle (the man-made randomization
through survey design) and pointed out very limited use, if any, for
it in survey design.

Carrying this idea further, statisticians in the 1970's
started suggesting the use of subjective sampling for an optimum
estimator. Royall (1970) suggested a subjective sample, called a
"balanced sample", for estimating the population total. For estimating
the population total, his estimator based on this balanced sample proved
to be most efficient under the assumption of a linear-regression
super-population model. Brewer (1963) first suggested this type of
purposive sampling. Later Royall (1973a, b) studied the robustness of the
estimator based on balanced samples. This idea was further developed
and extended by many other authors, namely, Holt (1975), Sigha (1976),
Mukhopadhyay (1977), Tallis (1978), Scott, Brewer and Ho (1978), Singh
and Garg (1979). There is considerable criticism of this type
of purposive sampling although the mathematical basis of this approach
is sound. However, it seems that as yet there is no conclusive
decision on the use of the design based approach and the model based
approach.




Both approaches have some merits and demerits. Some authors are trying
to mix these two streams. For example, Kolehmainen (1981) suggested
that stratification of the finite population should always be made,
if possible, and sampling within strata can be made purposively.
Basu (1978) also suggested some type of post-stratification of data.
In Chapter 3, we discuss this matter in greater detail.

In Chapter 4, we discuss the use of order statistics in the
estimation of quantiles of finite populations. There we have given
some results on properties of the distribution of order statistics
in finite population sampling, bivariate distribution of sample
quantiles and estimation of quantiles using auxiliary information.
In Chapter 5, we discuss the asymptotic behavior of some estimators
of finite population parameters and derive the asymptotic joint
distribution of sample quantiles.




CHAPTER II 


SUPER-POPULATION MODELS AND PREDICTION 


§2.1 INTRODUCTION 


In this chapter we shall study different types of super-population
models and sampling theories based on these models. The ideas and
presentation of this chapter mostly follow Cassel, Särndal and
Wretman (1977).

In Chapter 1, we have mentioned that the super-population model
arises when we consider the measurement $y = (y_1, \ldots, y_N)$ of a
finite population to be the outcome of a random variable
$Y = (Y_1, \ldots, Y_N)$. Let us denote the joint distribution of $Y$
by $\xi$. Before proceeding to the next section let us introduce some
useful definitions.


Ordered sample: A sequence $s^* = (k_1, \ldots, k_{n(s^*)})$ such that
$k_i \in \{1, \ldots, N\}$ for $i = 1, \ldots, n(s^*)$ is called an
ordered sample. The number of components of $s^*$, denoted by
$n(s^*)$, is called the sample size. □


Unordered sample: A non-empty set $s$ such that $s \subseteq \{1, \ldots, N\}$
is called an unordered sample. The number of elements of $s$, denoted
by $\nu(s)$, is called the effective sample size.


If the context involves only unordered samples, then we shall call the
unordered sample and the effective sample size simply the sample and
the sample size respectively. The set of all sets $s$ will be denoted
by $\mathcal{S}$.


Unordered sample design (or simply sample design): A function $p(s)$
on $\mathcal{S}$ satisfying $p(s) \ge 0$ for all $s \in \mathcal{S}$ and
$\sum_{s \in \mathcal{S}} p(s) = 1$ will be called an unordered sample
design. Some authors refer to the pair $(\mathcal{S}, p(\cdot))$ as the
design.
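The definition of a sample design can be made concrete with a small sketch. It assumes simple random sampling without replacement on a toy population of N = 5 labels; the function names `srs_design` and `inclusion_probabilities` are illustrative only and do not come from the thesis. The second function computes the probability that a given label enters the sample, a quantity used later in this chapter.

```python
from fractions import Fraction
from itertools import combinations


def srs_design(N, n):
    """Sample design p(s) for simple random sampling without replacement.

    Every unordered sample s of size n from the labels 1..N receives the
    same probability, so p(s) >= 0 and the probabilities sum to one.
    """
    samples = [frozenset(c) for c in combinations(range(1, N + 1), n)]
    prob = Fraction(1, len(samples))
    return {s: prob for s in samples}


def inclusion_probabilities(design, N):
    """Probability that label k belongs to the selected sample."""
    return {
        k: sum((p for s, p in design.items() if k in s), Fraction(0))
        for k in range(1, N + 1)
    }


design = srs_design(N=5, n=2)
total_probability = sum(design.values())
alpha = inclusion_probabilities(design, N=5)
print(total_probability, alpha[1])
```

Exact rational arithmetic is used so that the defining conditions of a design can be checked without rounding error.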


The definition of the ordered sample design is similar. 


Non-informative design: A sample design $p(\cdot)$ is called a
non-informative design if and only if $p(\cdot)$ is a function that
does not depend on the $y$-values associated with the labels in $s$ or
$s^*$. However, $p(\cdot)$ may be a function of auxiliary variables.


Fixed size design: If $n(s^*)$ or $\nu(s)$ is fixed, then the
respective design is called a fixed size design.


§2.2 DIFFERENT SUPER-POPULATION MODELS 


By super-population model or simply "model" we shall refer
to a class of distributions $\xi$ with various types of specifications.
These specifications may concern only the first few moments of $\xi$
or, to be more specific, we may assume that $\xi$ is some specific
well-defined statistical distribution. However, in both cases it is
assumed that the vector of finite population values
$y = (y_1, \ldots, y_N)$ is an outcome of the random variable
$Y = (Y_1, \ldots, Y_N)$ having distribution $\xi$.


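Definition: If $Q = Q(Y_1, \ldots, Y_N)$ is a function of
$Y_1, \ldots, Y_N$, the $\xi$-expectation of $Q$, denoted by
$\mathcal{E}(Q)$, is defined as

(2.1)   $\mathcal{E}(Q) = \int Q \, d\xi$,

and the $\xi$-variance of $Q$, denoted by $\mathcal{V}(Q)$, is defined as

(2.2)   $\mathcal{V}(Q) = \int [Q - \mathcal{E}(Q)]^2 \, d\xi$.

If $Q_1 = Q_1(Y_1, \ldots, Y_N)$ and $Q_2 = Q_2(Y_1, \ldots, Y_N)$ are two
functions of $Y_1, \ldots, Y_N$, the $\xi$-covariance of $Q_1$ and $Q_2$,
denoted by $\mathcal{C}(Q_1, Q_2)$, is defined as

(2.3)   $\mathcal{C}(Q_1, Q_2) = \int [Q_1 - \mathcal{E}(Q_1)][Q_2 - \mathcal{E}(Q_2)] \, d\xi$.

In particular, we shall define for $k, \ell = 1, \ldots, N$,

(2.4)   $\mu_k = \mathcal{E}(Y_k)$,  $\sigma_k^2 = \mathcal{V}(Y_k)$,  $\sigma_{k\ell} = \mathcal{C}(Y_k, Y_\ell)$ for $k \ne \ell$,

        $\bar{\mu} = \frac{1}{N} \sum_{k=1}^{N} \mu_k$  and  $\bar{Y} = \frac{1}{N} \sum_{k=1}^{N} Y_k$.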


There are two broad classifications of models used in survey 
sampling. They are (i) general models, denoted by G and 
(ii) exchangeability models, denoted by E. Often there will be 


subscripts to further specify both Model G and Model E. 




Model $G_T$ (transformation model). This model specifies the class of
distributions $\xi$ such that, for given $a_k > 0$ and $b_k$, the variables

        $Z_k = (Y_k - b_k)/a_k$,   $k = 1, \ldots, N$,

have common mean $\mu$, variance $\sigma^2$ and covariance $\rho\sigma^2$
for any pair $k \ne \ell$. Unless otherwise stated, in general $\mu$,
$\sigma^2$ and $\rho$ are unknown, $-\frac{1}{N-1} \le \rho \le 1$ and
$\sum_{k=1}^{N} a_k = N$. The condition on $\rho$ is required to have
non-negative $\mathcal{V}(\bar{Y})$. Therefore, under Model $G_T$,
$Y_k$ has the following moments:

(2.5)   $\mathcal{E}(Y_k) = \mu_k = a_k \mu + b_k$,
        $\mathcal{V}(Y_k) = \sigma_k^2 = a_k^2 \sigma^2$,
        $\mathcal{C}(Y_k, Y_\ell) = \sigma_{k\ell} = a_k a_\ell \rho \sigma^2$,  $k \ne \ell$.

Model $G_T$ implies that the first two moments of the transformed
variables $Z_k$ are unchanged. So we can suitably choose $a_k$ and
$b_k$ for specifying a good sample design for the problem in hand.

Model $G_{T0}$. The special case of Model $G_T$ where $a_k = 1$,
$b_k = 0$ for all $k = 1, \ldots, N$ is Model $G_{T0}$. This model
expresses that labels are uninformative.
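As a small check on the moments in (2.5), the sketch below builds the mean vector and covariance matrix of $Y_1, \ldots, Y_N$ from given $a_k$, $b_k$, $\mu$, $\sigma^2$ and $\rho$, and evaluates the variance of $\bar{Y}$ at the smallest admissible value $\rho = -1/(N-1)$. All numerical values and the helper names `gt_moments` and `variance_of_mean` are assumptions chosen for illustration.

```python
from fractions import Fraction


def gt_moments(a, b, mu, sigma2, rho):
    """Mean vector and covariance matrix of Y implied by (2.5)."""
    N = len(a)
    mean = [a[k] * mu + b[k] for k in range(N)]
    cov = [
        [
            a[k] ** 2 * sigma2 if k == l else a[k] * a[l] * rho * sigma2
            for l in range(N)
        ]
        for k in range(N)
    ]
    return mean, cov


def variance_of_mean(cov):
    """Variance of the unweighted mean: sum of all covariances over N^2."""
    N = len(cov)
    return sum(sum(row) for row in cov) / Fraction(N * N)


# Illustrative constants; the a_k are positive and sum to N = 4.
a = [Fraction(1, 2), Fraction(1), Fraction(3, 2), Fraction(1)]
b = [Fraction(1), Fraction(0), Fraction(2), Fraction(-1)]
N = len(a)
rho_min = Fraction(-1, N - 1)  # lower end of the admissible range for rho

mean, cov = gt_moments(a, b, mu=Fraction(10), sigma2=Fraction(4), rho=rho_min)
var_ybar = variance_of_mean(cov)
print(mean, var_ybar)
```

With these particular constants the variance of the mean stays non-negative even at the lower bound for $\rho$, which is the role the text assigns to the condition on $\rho$.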


Model G (regression model). The class of distributions $\xi$ such that
$Y_1, \ldots, Y_N$ are independently distributed, with the first two
moments of each $Y_k$ specified through a regression on auxiliary
values [the displayed moment equations are illegible in the source],
where the regression coefficients and $\sigma^2$ are unknown and the
auxiliary values form a set of known numbers for all $k$,
$k = 1, \ldots, N$.


Model G (ratio model). The class of distributions $\xi$ such that
$Y_1, \ldots, Y_N$ are independently distributed, and

        $\mu_k = \mathcal{E}(Y_k) = \beta x_k$,  $\sigma_k^2 = \mathcal{V}(Y_k) = \sigma^2 v(x_k)$,  $k = 1, \ldots, N$,

where $\beta$ and $\sigma^2$ are unknown, $v(\cdot)$ is a known function
and $x_1, \ldots, x_N$ are known positive constants. A common
assumption is $v(x_k) = x_k^g$, where $g$ is known.

Next, let us consider various types of exchangeability models.
In order to be exchangeable, the distribution $\xi$ must be symmetric
in accordance with the following definition.


Definition. Random variables $Y_1, \ldots, Y_N$ are called exchangeable
if $Y_{r_1}, \ldots, Y_{r_N}$ have, for every permutation
$r_1, \ldots, r_N$ of $1, \ldots, N$, the same joint distribution, which
is called an exchangeable distribution. □


The idea of exchangeable distribution in the context of finite
population was given by Ericson (1965). Variables $Y_1, \ldots, Y_N$
themselves may be assumed to be exchangeable. However, it is usually
assumed that the transformed $Y_k$, under change of origin and scale,
are exchangeable.


Model $E_T$. This model defines the class of distributions $\xi$ such
that, for known $a_k > 0$ and $b_k$, $k = 1, \ldots, N$, satisfying
$\sum_{k=1}^{N} a_k = N$, the random variables

        $Z_k = (Y_k - b_k)/a_k$,   $k = 1, \ldots, N$,

have an exchangeable absolutely continuous distribution. The common
mean, variance and covariance implied by the exchangeability will be
denoted by $\mu$, $\sigma^2$ and $\rho\sigma^2$ respectively. The first
and second order moments of $Y_k$ are given by (2.5). The $Y_k$'s
themselves become exchangeable in the following special case of
Model $E_T$.

Model $E_{T0}$. The special case of Model $E_T$ such that $a_k = 1$,
$b_k = 0$ for all $k = 1, \ldots, N$.

Let us now consider the discrete exchangeable super-population
model. This is mostly known as the random permutation or random
labeling model. This model was first used in Madow and Madow (1944)
and not addressed again until Kempthorne (1969). Recent works on
random permutation models are Royall (1970a), Ramakrishnan (1970),
C.R. Rao (1971), Godambe and Thompson (1973), Rao (1975) and Rao and
Bellhouse (1978). Under the random permutation model we assume that
the $N$ population values of $y$ are fixed but are labeled at random,
so that each permutation $r = (r_1, \ldots, r_N)$ of $1, \ldots, N$ is
assumed to have probability equal to $1/N!$ of being assigned as labels
for the units. The equivalent statement is that the fixed but unknown
numbers $y_1, \ldots, y_N$ are assigned randomly to units with labels
$1, \ldots, N$ so that each permutation of $y_1, \ldots, y_N$ has a
probability equal to $1/N!$ for fixed labels $1, \ldots, N$. Under both
versions it is implied that there is no systematic relationship between
labels and corresponding $y$-values. The $y$-value corresponding to the
$k$-th label can be regarded as the outcome of a random variable $Y_k$.
Let us now consider the following general model.


Model $E_{RP}$ (random permutation model). The class of distributions
$\xi$ such that, for any fixed, unknown numbers $z_1, \ldots, z_N$ and
for given numbers $a_k > 0$ and $b_k$, $k = 1, \ldots, N$, such that
$\sum_{k=1}^{N} a_k = N$, the random variables

        $Z_k = (Y_k - b_k)/a_k$,   $k = 1, \ldots, N$,

have an exchangeable distribution such that

        $P(Z_1 = z_{r_1}, \ldots, Z_N = z_{r_N}) = 1/N!$

for each permutation $r_1, \ldots, r_N$ of $1, \ldots, N$. Model $E_{RP}$
implies, for any $n$, $1 \le n \le N$, the marginal distribution

        $P(Z_{k_1} = z_{r_1}, \ldots, Z_{k_n} = z_{r_n}) = 1/N^{(n)}$

for each of the $N^{(n)} = N!/(N-n)!$ different sequences
$(z_{r_1}, \ldots, z_{r_n})$ of $n$ numbers chosen from
$z_1, \ldots, z_N$, where the corresponding random variables are
$Z_{k_1}, \ldots, Z_{k_n}$ for a fixed subset of labels
$k_1, \ldots, k_n$. The $\xi$-moments for the $Z_k$'s, for
$k = 1, \ldots, N$, are

(2.6)   $\mathcal{E}(Z_k) = \mu_z$,
        $\mathcal{E}(Z_k - \mu_z)^2 = \sigma_z^2$,
        $\mathcal{E}(Z_k - \mu_z)(Z_\ell - \mu_z) = -\sigma_z^2/(N-1)$,  $k \ne \ell$,

where the unknown $\mu_z$ and $\sigma_z^2$ are given by

        $\mu_z = \frac{1}{N} \sum_{k=1}^{N} z_k$,   $\sigma_z^2 = \frac{1}{N} \sum_{k=1}^{N} (z_k - \mu_z)^2$.

Therefore (2.6) implies

(2.7)   $\mu_k = \mathcal{E}(Y_k) = a_k \mu_z + b_k$,
        $\sigma_k^2 = a_k^2 \sigma_z^2$,
        $\sigma_{k\ell} = -a_k a_\ell \sigma_z^2/(N-1)$,  $k \ne \ell$.
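The moments of the random permutation model can be verified exactly for a small population by listing all $N!$ equally likely labelings. The sketch below does this for four assumed values $z_k$ and compares the results with $\mu_z$, $\sigma_z^2$ (divisor $N$) and the covariance $-\sigma_z^2/(N-1)$; the function name `permutation_moments` and the numbers are illustrative.

```python
from fractions import Fraction
from itertools import permutations


def permutation_moments(z, k=0, l=1):
    """Exact mean and variance of Z_k and covariance of (Z_k, Z_l)
    when every assignment of the fixed values z to the labels has
    probability 1/N!."""
    outcomes = list(permutations(z))
    w = Fraction(1, len(outcomes))
    mean_k = sum(w * o[k] for o in outcomes)
    mean_l = sum(w * o[l] for o in outcomes)
    var_k = sum(w * (o[k] - mean_k) ** 2 for o in outcomes)
    cov_kl = sum(w * (o[k] - mean_k) * (o[l] - mean_l) for o in outcomes)
    return mean_k, var_k, cov_kl


z = [Fraction(v) for v in (2, 3, 7, 8)]  # fixed, illustrative values
N = len(z)
mu_z = sum(z) / N
sigma2_z = sum((v - mu_z) ** 2 for v in z) / N

mean_k, var_k, cov_kl = permutation_moments(z)
print(mean_k, var_k, cov_kl)
```

The negative covariance reflects the fact that the total of the $Z_k$ is fixed: if one label receives a large value, the others must on average receive smaller ones.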


It is to be noted here that in general the $Y_k$ are not exchangeable.
But in the special case when $a_k = 1$ and $b_k = 0$ for all
$k = 1, \ldots, N$, so that $Z_k = Y_k$ in Model $E_{RP}$, the $Y_k$ are
exchangeable.


Model $E_{RP0}$. The special case of Model $E_{RP}$ such that $a_k = 1$,
$b_k = 0$, for $k = 1, \ldots, N$,

        $\mu_z = \mu_y = \frac{1}{N} \sum_{k=1}^{N} y_k$,   $\sigma_z^2 = \sigma_y^2 = \frac{1}{N} \sum_{k=1}^{N} (y_k - \mu_y)^2$.

So under this model any one of the $N!$ permutations of
$y_1, \ldots, y_N$ is an equally likely outcome of the random variables
$Y_1, \ldots, Y_N$.


Parametric Super-Population Models: Usually in parametric
super-population models the joint distribution $\xi$ of
$Y = (Y_1, \ldots, Y_N)$ is assumed to have known shape but depends on
the unknown parameter $\theta = (\theta_1, \theta_2, \ldots)$;
$\theta \in \Theta$, the parameter space. Let us assume that the
distribution $\xi_\theta$ is continuous and $g(y/\theta)$ is the density
of $Y$.


Model $G_P$ (parametric independent). The class of absolutely continuous
distributions $\xi$ such that $Y_1, \ldots, Y_N$ are independently but
not necessarily identically distributed. Their joint density function is

        $g(y/\theta) = \prod_{k=1}^{N} g_k(y_k/\theta)$,   $\theta \in \Theta$,

where $g_k(\cdot/\theta)$ is the density function of $Y_k$.


Model $E_P$ (parametric). The class of absolutely continuous
distributions $\xi$ such that $Y_1, \ldots, Y_N$ are exchangeable; their
joint density function is symmetric in its $N$ arguments, being
$g(y/\theta)$, $\theta \in \Theta$.


Model $E_{P0}$. The class of absolutely continuous distributions $\xi$
such that $Y_1, \ldots, Y_N$ are independently and identically
distributed (and hence exchangeable), their joint density function
being

        $g(y/\theta) = \prod_{k=1}^{N} g(y_k/\theta)$,   $\theta \in \Theta$,

where $g(\cdot/\theta)$ is the density function for all $Y_k$.


§2.3 SOME DEFINITIONS AND TERMINOLOGIES 


In survey sampling, we mostly deal with prediction of the
population total $\sum_{k=1}^{N} y_k$ or the population mean
$\bar{y} = \frac{1}{N} \sum_{k=1}^{N} y_k$ and their standard errors.
For inference on $\bar{y}$ we need data. Under the model based
approach, we have in general two sources of randomization in data. We
can represent the observed data by $d = \{(k, y_k);\ k \in s\}$, where
$s \in \mathcal{S}$, the set of all unordered samples, and
$y_k \in R_1$ for all $k = 1, \ldots, N$. Data $d$ is the outcome of the
random variable $\mathcal{D} = \{(k, Y_k);\ k \in S\}$, where $S$ is
random and, for each realized value $s$ of $S$, $Y_k$ for $k \in s$ is
random also. At this stage we can define two more random variables,
namely, $D_s = \{(k, Y_k);\ k \in s\}$ and $D_y = \{(k, y_k);\ k \in S\}$.
In $D_s$, $S = s$ is fixed, and in $D_y$, $Y_k = y_k$ is fixed for
$k = 1, \ldots, N$. So we have the sample space of $\mathcal{D}$ taking
value $d$:

        $\mathcal{X} = \{d :\ s \in \mathcal{S},\ y \in \Omega\}$,

where, usually, $\Omega = R_N$, the $N$-dimensional Euclidean space.
Let us define the statistic $T = T(\mathcal{D})$. If $S = s$ is given
then $T$ depends on $\mathcal{D}$ only through $Y_k$, $k \in s$. On the
other hand, if $Y_k = y_k$, $k = 1, \ldots, N$, is fixed or given, then
$T$ depends only on $S$, i.e. $T$ depends on the design only. Now if we
use $T(\mathcal{D})$ for inference on $\bar{Y}$, then we shall call
$T(\mathcal{D})$ a predictor or estimator of $\bar{Y}$. Hence we can
replace $\mathcal{D}$ in the predictor $T(\mathcal{D})$ by $D_s$ to get a
new predictor $T(D_s)$ for $\bar{Y}$. This is still a function of the
random variables $Y_k$, $k \in s$. We shall often use simply $T$ for
$T(\mathcal{D})$ or $T(D_s)$, just to indicate that $T$ is a function of
random variables $Y_k$ where $k$ may be in $S$ or in $s$ respectively.
On the other hand, the random variable obtained from $T(\mathcal{D})$
for $Y_k = y_k$, $k = 1, \ldots, N$, fixed but $k \in S$, will be written
as $t(D_y)$. The value of $T(\mathcal{D})$ for $S = s$ and $Y_k = y_k$,
$k \in s$, will be written as $t(d)$. Thus $t(d)$ is no longer a random
variable and will be termed the estimate or predicted value of
$\bar{y}$. As above, we shall simply write $t$ for $t(D_y)$ or $t(d)$.
The small letter $t$ will indicate that $t$ is a function of the
realized values $y_k$ of $Y_k$ for $k \in S$ or $k \in s$.


If $T$ is a predictor of $\bar{Y}$ then let us define the $p$-expectation
as

        $E(T) = \sum_{s \in \mathcal{S}} p(s)\, T$,

the $p$-variance as

        $V(T) = \sum_{s \in \mathcal{S}} p(s)\, (T - E(T))^2$,

and the $p$-mean-square-error ($p$-MSE) as

        $\mathrm{MSE}(T) = \sum_{s \in \mathcal{S}} p(s)\, (T - \bar{Y})^2$.

It is to be noted here that $E(T)$, $V(T)$ and $\mathrm{MSE}(T)$ are
functions of the random variables $Y_1, \ldots, Y_N$.
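The three design-based quantities just defined can be computed by direct enumeration once the population values are held fixed. The sketch below assumes simple random sampling of $n = 2$ units from a toy population of four values and uses the sample mean as the predictor; the helper names and the data are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def srs_design(labels, n):
    """Equal-probability design over all unordered samples of size n."""
    samples = [frozenset(c) for c in combinations(labels, n)]
    return {s: Fraction(1, len(samples)) for s in samples}


def sample_mean(s, y):
    """Predictor: mean of the observed values in sample s."""
    return Fraction(sum(y[k] for k in s), len(s))


def p_moments(design, predictor, y):
    """Return (E(t), V(t), MSE(t)) for fixed population values y."""
    ybar = Fraction(sum(y.values()), len(y))
    values = {s: predictor(s, y) for s in design}
    e = sum(p * values[s] for s, p in design.items())
    v = sum(p * (values[s] - e) ** 2 for s, p in design.items())
    mse = sum(p * (values[s] - ybar) ** 2 for s, p in design.items())
    return e, v, mse


y = {1: 3, 2: 5, 3: 10, 4: 6}  # illustrative realized values
design = srs_design(y.keys(), n=2)
e, v, mse = p_moments(design, sample_mean, y)
print(e, v, mse)
```

Because the sample mean is design unbiased under this design, its $p$-variance and $p$-MSE coincide for these data.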


Definition. $T$ is called a $p$-unbiased (design unbiased) predictor of
$\bar{Y}$ if and only if, for a given design $p$, $E(t) = \bar{y}$ for
all $y = (y_1, \ldots, y_N) \in R_N$, where $t$ is the realized value of
$T$ for $Y_k = y_k$, $k \in S$. The strategy $(p, T)$ is called
$p$-unbiased if $T$ is a $p$-unbiased predictor under $p$.


Definition. $T$ is called a $\xi$-unbiased predictor of $\bar{Y}$ if and
only if, for any distribution $\xi$, $\mathcal{E}(T - \bar{Y}) = 0$ for
all $s \in \mathcal{S}$, where $\mathcal{E}$ is the expectation operator
with respect to $\xi$.


Remark: A predictor can be $p$-unbiased but not $\xi$-unbiased and
vice versa. For example, if $p$ is a simple random sampling plan, then
under the ratio model $T = \frac{1}{n} \sum_{k \in s} Y_k$ is a
$p$-unbiased predictor, but with $\bar{x} = \frac{1}{N} \sum_{k=1}^{N} x_k$,

        $\mathcal{E}(T - \bar{Y}) = \beta \left( \frac{1}{n} \sum_{k \in s} x_k - \bar{x} \right) \ne 0$,

and hence $T$ is not $\xi$-unbiased. On the other hand the ratio
predictor,

        $T = \frac{\sum_{k \in s} Y_k}{\sum_{k \in s} x_k}\, \bar{x}$,

is not $p$-unbiased but $\xi$-unbiased.
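The contrast in the remark can be illustrated numerically. The sketch below assumes simple random sampling with $n = 2$ from four units, made-up values of $x_k$ and $y_k$, and a ratio-type model with $\mathcal{E}(Y_k) = \beta x_k$. Since both predictors are linear in the $Y_k$ for a fixed sample, their $\xi$-expectations are obtained by substituting $\beta x_k$ for $Y_k$; the names `expansion` and `ratio` are illustrative.

```python
from fractions import Fraction
from itertools import combinations

x = {1: 1, 2: 2, 3: 4, 4: 5}   # known auxiliary values (illustrative)
y = {1: 3, 2: 5, 3: 10, 4: 6}  # one realized population (illustrative)
N, n = len(x), 2
xbar = Fraction(sum(x.values()), N)
ybar = Fraction(sum(y.values()), N)
samples = [frozenset(c) for c in combinations(x, n)]
p = Fraction(1, len(samples))  # simple random sampling


def expansion(s, v):
    """Sample mean of the values v over the sample s."""
    return Fraction(sum(v[k] for k in s), len(s))


def ratio(s, v, x):
    """Ratio predictor of the population mean."""
    return Fraction(sum(v[k] for k in s), sum(x[k] for k in s)) * xbar


# Design side: p-expectations at the fixed values y.
e_p_mean = sum(p * expansion(s, y) for s in samples)
e_p_ratio = sum(p * ratio(s, y, x) for s in samples)

# Model side: xi-bias for each sample, using E(Y_k) = beta * x_k.
beta = Fraction(2)
mu = {k: beta * x[k] for k in x}
mu_bar = beta * xbar
xi_bias_mean = {s: expansion(s, mu) - mu_bar for s in samples}
xi_bias_ratio = {s: ratio(s, mu, x) - mu_bar for s in samples}

print(e_p_mean - ybar, e_p_ratio - ybar)
```

For these data the sample mean has zero design bias but a non-zero model bias in most samples, while the ratio predictor shows the opposite pattern.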


Definition. $T$ is called a $p\xi$-unbiased predictor of $\bar{Y}$ if and
only if, for given $p$ and $\xi$, $\mathcal{E}E(T - \bar{Y}) = 0$.




Definition. If $T_1$ and $T_2$ are predictors such that, for the given
design $p$, $\mathcal{E}\mathrm{MSE}(p, T_1) \le \mathcal{E}\mathrm{MSE}(p, T_2)$
for all $\xi$ in a given class of super-populations, then $T_1$ is
called at least as good a predictor as $T_2$ for the design $p$. If
strict inequality holds for at least one $\xi$ in the class, then $T_1$
will be called better than $T_2$.

Definition. If $(p_1, T_1)$ and $(p_2, T_2)$ are strategies such that
$\mathcal{E}\mathrm{MSE}(p_1, T_1) \le \mathcal{E}\mathrm{MSE}(p_2, T_2)$
for all $\xi$ in the given class, then we shall say that $(p_1, T_1)$ is
at least as good a strategy as $(p_2, T_2)$. If strict inequality holds
for at least one $\xi$ in the class, then we say that $(p_1, T_1)$ is
better than $(p_2, T_2)$. □


If $t_1 = t_1(D_y)$ and $t_2 = t_2(D_y)$ are estimators of $\bar{y}$ and
$E(t_1 - \bar{y})^2 \le E(t_2 - \bar{y})^2$ for all $y \in R_N$, then
$\mathcal{E}E(T_1 - \bar{Y})^2 \le \mathcal{E}E(T_2 - \bar{Y})^2$ for any
super-population model $\xi$, where $T_1$ and $T_2$ are the predictors
of $\bar{Y}$ corresponding to the estimators $t_1$ and $t_2$
respectively. Hence if $t_1$ is at least as good as $t_2$ for
estimating $\bar{y}$, for a given design $p$, then $T_1$ is at least as
good a predictor as $T_2$ for any $\xi$.


Lemma 2.1 (Cassel et al., 1977). Let $T$ be any predictor of $\bar{Y}$.
For any $\xi$ and for any non-informative design $p$,

(2.8)   $\mathcal{E}\mathrm{MSE}(p, T) = E\mathcal{V}(T) + E[\mathcal{B}(T)]^2 + \mathcal{V}(\bar{Y}) - 2E\,\mathcal{C}(T, \bar{Y})$,

where $\mathcal{V}(T) = \mathcal{E}(T - \mathcal{E}(T))^2$ and
$\mathcal{B}(T) = \mathcal{E}(T - \bar{Y})$ are the $\xi$-variance and
$\xi$-bias of $T$ respectively. In particular:

(a) If $T$ is $p$-unbiased then

(2.9)   $\mathcal{E}V(T) = E\mathcal{V}(T) + E[\mathcal{B}(T)]^2 - \mathcal{V}(\bar{Y})$.

(b) If $T$ is $p$- as well as $\xi$-unbiased then

(2.10)  $\mathcal{E}V(T) = E\mathcal{V}(T) - \mathcal{V}(\bar{Y})$. □
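A short derivation may help in reading Lemma 2.1. It is a sketch only, written with $\mathcal{E}$, $\mathcal{V}$, $\mathcal{C}$ for the $\xi$-expectation, variance and covariance, $E$ for the $p$-expectation, and $\bar{\mu} = \mathcal{E}(\bar{Y})$; it assumes that the two expectations may be interchanged, which is where the non-informative design is used.

```latex
\begin{align*}
\mathcal{E}\,\mathrm{MSE}(p,T)
  &= \mathcal{E}\,E\,(T-\bar{Y})^2 = E\,\mathcal{E}\,(T-\bar{Y})^2 , \\
\mathcal{E}\,(T-\bar{Y})^2
  &= \mathcal{V}(T-\bar{Y}) + \bigl[\mathcal{E}(T-\bar{Y})\bigr]^2 \\
  &= \mathcal{V}(T) + \mathcal{V}(\bar{Y}) - 2\,\mathcal{C}(T,\bar{Y})
     + \bigl[\mathcal{B}(T)\bigr]^2 , \\
\intertext{and taking $E$ of the last line gives (2.8). If $T$ is
$p$-unbiased, then $E(T)=\bar{Y}$ and $E\,\mathcal{E}(T)=\bar{\mu}$, so}
E\,\mathcal{C}(T,\bar{Y})
  &= \mathcal{E}\bigl[(E(T)-E\,\mathcal{E}(T))(\bar{Y}-\bar{\mu})\bigr]
   = \mathcal{E}(\bar{Y}-\bar{\mu})^2 = \mathcal{V}(\bar{Y}) ,
\end{align*}
```

Substituting this into (2.8), and noting that the $p$-MSE of a $p$-unbiased predictor equals its $p$-variance, gives part (a); part (b) follows by setting $\mathcal{B}(T) = 0$. The `\intertext` command assumes the amsmath package.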


In the next two sections we are going to discuss various
predictors under the design oriented super-population model and the
design-independent super-population model.


§2.4 PREDICTION UNDER DESIGN ORIENTED SUPER-POPULATION MODEL 


Since the publication of the paper by Horvitz and Thompson
(1952), the estimator $T_{HT}$, well known as the Horvitz-Thompson
estimator, has been considered in the traditional literature of survey
sampling as the most attractive estimator. Though it possesses some
good optimality properties, it has lost some of its attractiveness
after the development of super-population ideas. The Horvitz-Thompson
estimator is defined for any arbitrary design as

(2.11)  $T_{HT} = \sum_{k \in s} \frac{Y_k}{N \alpha_k}$,

where $\alpha_k$ is the inclusion probability of unit $k$,
$k = 1, \ldots, N$. Basu (1971) suggested a modified form of $T_{HT}$
which is known as the generalized difference predictor and is defined
as follows.

Given an arbitrary vector $e = (e_1, \ldots, e_N)$ and a design with
inclusion probabilities $\alpha_k > 0$, $k = 1, \ldots, N$, the
generalized difference predictor is given by

(2.12)  $T_{GD} = \sum_{k \in s} \frac{Y_k - e_k}{N \alpha_k} + \bar{e}$,

where $\bar{e} = \frac{1}{N} \sum_{k=1}^{N} e_k$.
The estimator $T_{GD}$ has the following properties:

(i)   $T_{GD}$ is $p$-unbiased;

(ii)  $T_{GD}$ has zero $p$-variance for any value $y$ of $Y$ that
      satisfies $(y_k - e_k) \propto \alpha_k$ $(k = 1, \ldots, N)$,
      provided that $p$ is a design with fixed effective sample size
      $n$, abbreviated as FES($n$);

(iii) $T_{GD}$ is $\xi$-unbiased for any model if $e_k = \mu_k$,
      $k = 1, \ldots, N$;

(iv)  $T_{GD}$ reduces to $T_{HT}$ if (a) $e = 0$ or (b)
      $e_k \propto \alpha_k$, $k = 1, \ldots, N$, and $p$ is an FES($n$)
      design;

(v)   $T_{GD}$ is origin and scale invariant.
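Properties (i), (ii) and (iv)(a) can be checked by enumeration on a toy population. The sketch below assumes a fixed-size design with $p(s)$ proportional to the sum of the labels in $s$ (an arbitrary choice that makes the inclusion probabilities unequal), made-up values of $y_k$ and $e_k$, and a predictor of the population mean of the form $\sum_{k \in s} (y_k - e_k)/(N\alpha_k) + \bar{e}$. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def fixed_size_design(N, n, weight):
    """FES(n) design with p(s) proportional to weight(s) (illustrative)."""
    samples = [frozenset(c) for c in combinations(range(1, N + 1), n)]
    total = sum(weight(s) for s in samples)
    return {s: Fraction(weight(s), total) for s in samples}


def inclusion_probabilities(design, N):
    """alpha_k: total probability of the samples that contain label k."""
    return {
        k: sum((p for s, p in design.items() if k in s), Fraction(0))
        for k in range(1, N + 1)
    }


def t_gd(s, y, e, alpha):
    """Generalized difference predictor of the population mean."""
    N = len(y)
    e_bar = sum(e.values(), Fraction(0)) / N
    diff = sum((Fraction(y[k] - e[k]) / (N * alpha[k]) for k in s), Fraction(0))
    return diff + e_bar


def p_expectation(design, stat):
    return sum(p * stat(s) for s, p in design.items())


N, n = 4, 2
design = fixed_size_design(N, n, weight=sum)
alpha = inclusion_probabilities(design, N)
y = {1: 3, 2: 5, 3: 10, 4: 6}
e = {1: 2, 2: 4, 3: 9, 4: 7}
zero = {k: 0 for k in y}
ybar = Fraction(sum(y.values()), N)

# (i) p-unbiasedness for an arbitrary e; e = 0 gives the Horvitz-Thompson form.
e_gd = p_expectation(design, lambda s: t_gd(s, y, e, alpha))
e_ht = p_expectation(design, lambda s: t_gd(s, y, zero, alpha))

# (ii) zero p-variance when y_k - e_k is proportional to alpha_k.
y_prop = {k: e[k] + 30 * alpha[k] for k in y}
gd_values = {t_gd(s, y_prop, e, alpha) for s in design}

print(e_gd, e_ht, gd_values)
```

In the proportional case every sample yields the same predicted value, which is what zero $p$-variance means for a fixed-size design.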


If $p$ is an FES($n$) design with $\alpha_k > 0$, $k = 1, \ldots, N$,
then under Model $G_T$, $\mathcal{E}V(T_{GD})$ is minimized for the
choice $e_k = b_k$ and $\alpha_k = f a_k$, $k = 1, \ldots, N$, where
$f = n/N$. So, the optimal strategy of type $(p, T_{GD})$ is given by
$(p_0, T_{GD0})$, consisting of

(i)  any FES($n$) design,

(2.13)  $p_0 = p_0(s)$, such that $\alpha_k = f a_k$, $k = 1, \ldots, N$;

(ii) the predictor,

(2.14)  $T_{GD0} = \sum_{k \in s} \frac{Y_k - b_k}{n a_k} + \bar{b}$,   $\bar{b} = \frac{1}{N} \sum_{k=1}^{N} b_k$.




The predictor $T_{GD0}$ has the following properties:

(i)  $T_{GD0}$ is $\xi$-unbiased and $p\xi$-unbiased for any $\xi$
     satisfying Model $G_T$ and for any FES($n$) design $p$.

(ii) $T_{GD0}$ is $p$-unbiased under $p = p_0$, but for an arbitrary
     FES($n$) design it is not necessarily $p$-unbiased.


Let us now consider a predictor $T$ belonging to the class of all
$p$-unbiased linear predictors of $\bar{Y}$. Hence, $E(T) = \bar{Y}$
and $T$ is of the form:

(2.15)  $T = w_{0s} + \sum_{k \in s} w_{ks} Y_k$.


Theorem 2.1 (Cassel et al. (1976)). Under Model $G_T$ and for $\bar{a} = \sum_{1}^{N} a_k/N$ and $A = \sum_{1}^{N} a_k^2/N$,

(2.16)  $\mathcal{E}V(T) \ge \mathcal{E}V(p_0, T_{GDo}) = \dfrac{(1-\rho)(\bar{a}^2 - fA)\sigma^2}{n}$

for any strategy $(p,T)$ such that $p$ is an FES(n) design with $\alpha_k > 0$, $k = 1,\dots,N$, and $T$ is a linear p-unbiased predictor of $\bar{Y}$; equality holds if and only if $(p,T) = (p_0, T_{GDo})$. □

Remark. A strategy based on the p-unbiased predictor $T_{HT}$ as in (2.11) can never, under model $G_T$, be better than the strategy $(p_0, T_{GDo})$.


In the light of the above theorem, under model $G_T$ it is in general advisable, for an optimal predictor, to use a design giving large inclusion probabilities to units considered by the model to be highly




variable. However, if all $Y_k$ are assumed to have equal variances, that is, $a_k = 1$ for all $k = 1,\dots,N$, then $(p_0, T_{GDo})$ is the best strategy, where $p_0 = p_0(s)$ is such that $\alpha_k = f$ for all $k = 1,\dots,N$ (satisfied, for example, in the case of simple random sampling) and

(2.17)  $T_{GDo} = \bar{Y}_s + \bar{b} - \bar{b}_s$,

the well known difference estimator, where $\bar{b}_s = \frac{1}{n}\sum_{s} b_k$, and

(2.18)  $\mathcal{E}V(p_0, T_{GDo}) = \dfrac{(1-\rho)(1-f)\sigma^2}{n}$.
é o «Do n 


Under stratified random sampling, we know that under optimum allocation the number of units to be selected from a stratum is proportional to the size of that stratum multiplied by its standard deviation. That is, we select a larger fraction of units from a stratum having large variance. So, stratified random sampling is also a technique of unequal probability sampling for units not in the same stratum. In the Horvitz-Thompson strategy we also give larger selection probabilities to units for which the variance is large; the same optimization problem has merely been attacked from a different angle.
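The link between optimum allocation and unequal inclusion probabilities can be made concrete with a small sketch. It assumes the usual optimum (Neyman) allocation with equal costs, $n_h \propto N_h S_h$; the function name `neyman_allocation` and the stratum figures are hypothetical.

```python
def neyman_allocation(stratum_sizes, stratum_sds, n):
    """Optimum allocation with equal costs: n_h proportional to N_h * S_h."""
    weights = [N_h * S_h for N_h, S_h in zip(stratum_sizes, stratum_sds)]
    total = sum(weights)
    return [n * w / total for w in weights]


# Two strata of equal size; the second is three times as variable.
sizes = [100, 100]
sds = [1.0, 3.0]
allocation = neyman_allocation(sizes, sds, 40)

# Within-stratum inclusion probabilities n_h / N_h differ between strata,
# so units in the more variable stratum are sampled with higher probability.
inclusion_probs = [n_h / N_h for n_h, N_h in zip(allocation, sizes)]
```

In practice the $n_h$ would be rounded to integers and capped at $N_h$; the sketch leaves them as real numbers.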


We have mentioned in Theorem 2.1 that the strategy $(p_0, T_{GDo})$ is optimal in the class of p-unbiased linear estimators. But if we have an exchangeable model, then, as stated in Theorem 2.2 below, $(p_0, T_{GDo})$ is also optimum in the wider class of all p-unbiased estimators. We do not have to adhere to linear estimators only.

Lemma 2.2 (Cassel et al., 1977). Let $p$ be any given FES(n) design with $\alpha_k > 0$, $k = 1,\dots,N$. Then, under Model $E_T$,




; 2 
(2.19) €x(r-i)? > GEC =H)” = ens , 


for any linear or non-linear $p\xi$-unbiased estimator $T$ of $\bar{Y}$; equality holds if and only if $T = T_{GDo}$.


Theorem 2.2 (Cassel et al., 1977). Under Model $E_T$, letting $\bar{a} = \sum_{1}^{N} a_k/N$ and $A = \sum_{1}^{N} a_k^2/N$,

(2.20)  $\mathcal{E}V(T) \ge \mathcal{E}V(p_0, T_{GDo}) = \dfrac{(1-\rho)(\bar{a}^2 - fA)\sigma^2}{n}$

for any strategy $(p,T)$ such that $p$ is an FES(n) design with $\alpha_k > 0$, $k = 1,\dots,N$, and $T$ belongs to the class of all (linear or non-linear) p-unbiased predictors of $\bar{Y}$; equality holds if and only if $(p,T) = (p_0, T_{GDo})$, where $p_0$ and $T_{GDo}$ are given by (2.13) and (2.14) respectively. □


Theorem 2.2 is also true for the random permutation model $E_{RP}$. An extensive investigation of random permutation models has been done by Rao and Bellhouse (1978). Using generalized random permutation models and a general class of linear estimators of the finite population mean, Rao and Bellhouse have shown that many of the conventional estimators are optimal in the sense of minimum average mean-square error. They investigated optimality under the following sample designs: unistage design, stratified design, post-stratified design, double sampling design, sampling on two occasions and two-stage sampling design.


§2.5 PREDICTION UNDER DESIGN FREE SUPER-POPULATION MODEL 


Most survey statisticians who believe in the model based approach to estimation in survey sampling argue that p-unbiasedness is an unnecessarily heavy restriction, and that $\xi$-unbiasedness or possibly $p\xi$-unbiasedness should be required instead. Their opinion is that the average of $(T-\bar{Y})^2$ with respect to the design $p$ is a matter of presampling interest only. In this section, we shall deal with the design-free model based approach to prediction. Here the distribution $\xi$ is the essential element of inference, and $s$ is treated as given, with less attention paid to the design $p$ producing the sample $s$. So, our object is to choose $T$, for any given $s$, to minimize $\mathcal{E}(T-\bar{Y})^2$. The average with respect to $p$ is of secondary importance. It turns out that the predictor $T$ that minimizes $\mathcal{E}(T-\bar{Y})^2$ for any given $s$ is also the predictor that minimizes $\mathcal{E}E(T-\bar{Y})^2$ for any given non-informative design $p$.

Here we shall assume that the super-population distribution $\xi = \xi_\theta$ depends on a certain parameter (or parameter vector) $\theta \in \Theta$, which is unknown. Once we can specify $\xi_\theta$, the prediction of $\bar{Y}$ becomes a classical inference problem.


For an arbitrary set $s \in \mathcal{S}$, let $\xi_s = \xi_{s,\theta}$ be the marginal distribution of $Y_{k_1},\dots,Y_{k_{\nu(s)}}$, where $k_1 < k_2 < \dots < k_{\nu(s)}$ is an enumeration in increasing order of the labels $k \in s$, and let $\xi_{\bar s/s} = \xi_{\bar s/s,\theta}$ be the joint conditional distribution of $Y_k$, $k \in \bar s$ (taken in increasing order of $k$), given $Y_{k_1},\dots,Y_{k_{\nu(s)}}$. Let the corresponding density functions be $g(y/\theta)$, $g_s(y_s/\theta)$ and $g_{\bar s/s}(y_{\bar s}/\theta)$.

Note that if $\xi$ is an exchangeable distribution, then so are $\xi_s$ and $\xi_{\bar s/s}$. Let $\mathcal{E}$, $\mathcal{E}_s$ and $\mathcal{E}_{\bar s/s}$ be the expectation operators associated with $\xi$, $\xi_s$ and $\xi_{\bar s/s}$ respectively. Now, if $p$ is non-informative, which we shall in general assume, then the operators $\mathcal{E}$ and $E$ may be interchanged, that is




(2.21)  $\mathcal{E}\mathrm{MSE}(p,T) = \mathcal{E}E(T-\bar{Y})^2 = E\,\mathcal{E}(T-\bar{Y})^2 = E\,\mathcal{E}_s\mathcal{E}_{\bar s/s}(T-\bar{Y})^2$.

Here our objective is to minimize $\mathcal{E}(T-\bar{Y})^2$ for any sample $s$. So if we can find a $T^*$ which minimizes $\mathcal{E}(T-\bar{Y})^2$ for any $s \in \mathcal{S}$, and $p$ is any non-informative design, then $T^*$ also has the property of minimizing $\mathcal{E}\mathrm{MSE}(p,T)$ for any given design $p$. Alternatively, if $T^*$ is such that it minimizes $\mathcal{E}(T-\bar{Y})^2$, then in the presampling stage we can look for the best design $p$ which uses $T^*$, that is, the design which minimizes $\mathcal{E}\mathrm{MSE}(p,T^*)$ over different $p$.


Let the population mean

(2.22)  $\bar{Y} = f_s\bar{Y}_s + (1-f_s)\bar{Y}_{\bar s}$,

where $f_s = \nu(s)/N$, $\bar{Y}_s = \sum_{s} Y_k/\nu(s)$ and $\bar{Y}_{\bar s} = \sum_{\bar s} Y_k/\{N-\nu(s)\}$, have realized the value $\bar{y}$ of $\bar{Y}$; this can be expressed as

(2.23)  $\bar{y} = f_s\bar{y}_s + (1-f_s)\bar{y}_{\bar s}$.

In this representation of the population mean $\bar{y}$, the first part of the right hand side is known due to sampling. So, Basu (1971) suggested that attempts should be made for a post survey estimation of the unknown part $\bar{y}_{\bar s}$. But this idea is criticized by decision-theorists, as here the estimator is selected after observing the data.

Let $U$ be a predictor of $\bar{Y}_{\bar s}$; then it follows from (2.22) that, for any given sample $s$,

(2.24)  $T = f_s\bar{Y}_s + (1-f_s)U$

is a predictor of $\bar{Y}$. Since $s$ is given, the distribution of $U = U(D_s)$



and $T(D_s) = f_s\bar{Y}_s + (1-f_s)U(D_s)$ depends entirely on $\xi$. In terms of $U$, the $\mathcal{E}$MSE can be written as

(2.25)  $\mathcal{E}\mathrm{MSE}(p,T) = E\{(1-f_s)^2\,\mathcal{E}_s\mathcal{E}_{\bar s/s}(U - \bar{Y}_{\bar s})^2\}$.

If $\xi$ is completely known, minimum $\mathcal{E}$MSE is obtained if, for any given $s$, we choose

(2.26)  $U = \mathcal{E}_{\bar s/s}(\bar{Y}_{\bar s})$.

However, if $\xi$ depends on the unknown parameter vector $\theta$, then we shall first have to estimate $\theta$, and then attempt to predict $\bar{Y}_{\bar s}$. Note that $T$ is $\xi$-unbiased for $\bar{Y}$ if and only if, for every $s \in \mathcal{S}$, $U$ is $\xi$-unbiased for $\bar{Y}_{\bar s}$.


Let us now consider some $\xi$-unbiased predictors. If we relax the condition of p-unbiasedness of the last section and impose the looser restriction of $\xi$-unbiasedness, then we can find predictors with smaller mean-square errors. This is demonstrated by the following theorem of Cassel et al. (1977).

Theorem 2.3. Let $p$ be any given design. Then, under Model $G_T$,

(2.27)  $\mathcal{E}E(T-\bar{Y})^2 \ge \mathcal{E}E(T^*-\bar{Y})^2$,

where $T$ is any linear $\xi$-unbiased predictor of $\bar{Y}$ and, for any $s \in \mathcal{S}$,

(2.28)  $T^* = f_s\bar{Y}_s + (1-f_s)(\bar{Z}_s\bar{a}_{\bar s} + \bar{b}_{\bar s})$,

where

(2.29)  $\bar{Z}_s = \dfrac{1}{\nu(s)}\sum_{s} Z_k = \dfrac{1}{\nu(s)}\sum_{s}\dfrac{Y_k - b_k}{a_k}$,

and $\bar{a}_{\bar s}$, $\bar{b}_{\bar s}$ are the means of $a_k$ and $b_k$ over the non-sampled units. Equality holds if and only if $T = T^*$. □
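The predictor $T^*$ of (2.28) is simple to compute once $a_k$ and $b_k$ are known for every unit. The sketch below is an illustration only; the function name `t_star` is hypothetical and units are labelled $0,\dots,N-1$ for convenience.

```python
def t_star(sample, y, a, b):
    """T* = f_s * ybar_s + (1 - f_s) * (zbar_s * abar_rest + bbar_rest).

    sample: labels of the sampled units; y: observed values indexed by label;
    a, b: known model constants a_k > 0 and b_k for all N units.
    """
    N = len(a)
    nu = len(sample)
    if not 0 < nu < N:
        raise ValueError("the sample must be a proper non-empty subset")
    in_s = set(sample)
    rest = [k for k in range(N) if k not in in_s]
    f_s = nu / N
    y_bar_s = sum(y[k] for k in sample) / nu
    # Mean of the standardized values Z_k = (Y_k - b_k) / a_k over the sample.
    z_bar_s = sum((y[k] - b[k]) / a[k] for k in sample) / nu
    a_bar_rest = sum(a[k] for k in rest) / len(rest)
    b_bar_rest = sum(b[k] for k in rest) / len(rest)
    return f_s * y_bar_s + (1 - f_s) * (z_bar_s * a_bar_rest + b_bar_rest)


# With a_k = 1 and b_k = 0 the predictor is the sample mean.
mean_case = t_star([0, 2], {0: 2.0, 2: 4.0}, [1.0] * 4, [0.0] * 4)

# With a_k = 1 the predictor is ybar_s + bbar - bbar_s.
diff_case = t_star([0, 1], {0: 5.0, 1: 7.0}, [1.0] * 4, [1.0, 2.0, 3.0, 4.0])
```

The two calls reproduce the sample mean and the difference predictor, respectively, as special cases of $T^*$.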




Next, let us consider the predictor $T^*$ under various special cases of Model $G_T$.

(i) If under Model $G_T$, $b_k = 0$ for all $k = 1,\dots,N$, then

(2.30)  $T^* = \bar{Z}_s\bar{a} + \dfrac{1}{N}\sum_{s}(a_k - \bar{a}_s)Z_k$,

where $\bar{Z}_s$ is as in (2.29), $Z_k = Y_k/a_k$ and $\bar{a}_s = \sum_{s} a_k/\nu(s)$. If $p$ is an FES(n) design, then

(2.31)  $T^* = T_{HTo} + \dfrac{1}{N}\sum_{s}(a_k - \bar{a}_s)Z_k$,

where $T_{HTo} = \sum_{s} Y_k/(N\alpha_{ok})$.

(ii) If under Model $G_T$, $a_k = 1$ for all $k = 1,\dots,N$, then

(2.32)  $T^* = \bar{Y}_s + \bar{b} - \bar{b}_s$,

the usual difference predictor.

(iii) Finally, under Model $G_{T0}$,

$T^* = \bar{Y}_s$,

the sample mean.

For the sake of comparison let us discuss the intuitively appealing predictor,

(2.33)  $T^\circ = \bar{Z}_s\bar{a} + \bar{b}$.


This predictor has the following properties.
(i) $T^\circ$ is a $\xi$-unbiased predictor of $\bar{Y}$ under Model $G_T$.
(ii) $T^\circ$ minimizes, under Model $G_T$, for any fixed $s$, the criterion $\mathcal{E}(T - \bar\mu)^2$ among linear $\xi$-unbiased estimators of the super-population parameter $\bar\mu = \mu\bar{a} + \bar{b}$; hence $T^\circ$ would be preferred if inference were directed to the super-population and not to the realization $\bar{y}$.
(iii) If $p$ is an FES(n) design, $T^\circ = T_{GDo}$ given by (2.14).
(iv) If $b_k = 0$ for all $k$ and $p$ is an FES(n) design, then $T^\circ = T_{HTo}$.

From the above discussion, it follows that $T^*$ and $T^\circ$ are both optimal, but by different criteria. If the criterion is min $\mathcal{E}$MSE, then $T^*$ is optimal and better than $T^\circ$ to the extent shown in the following theorem (Cassel et al., 1977).


Theorem 2.4. Under Model $G_T$, for any design $p$,

(2.34)  $\mathcal{E}E(T^\circ - \bar{Y})^2 - \mathcal{E}E(T^* - \bar{Y})^2 = \dfrac{E\{\sum_{s}(a_k - \bar{a}_s)^2\}(1-\rho)\sigma^2}{N^2} \ge 0$,

where $T^*$ and $T^\circ$ are given in (2.28) and (2.33). Moreover,
2E (A-a_) 
Oye e2) i A 2 
a = —— - =} + ———"- - 
(2.35) €& E(t Ay) [Etiam iy! = ] (1-p)o 
- 1 eee. 
where aq = v(S) ; a» and A= N d a - Strict inequality holds 
in (2.34) if $p(s) > 0$ for some $s$ such that not all $a_k$ for $k \in s$ are equal. □
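The gap between the two predictors can be checked numerically for a fixed sample. The sketch below is an illustration, not part of the cited proof. It assumes that under Model $G_T$ the standardized values $Z_k$ have common variance $\sigma^2$ and common correlation $\rho$, so that a prediction error of the form $\sum_k c_k Z_k$ with zero $\xi$-mean has $\xi$-variance $\sigma^2\{(1-\rho)\sum c_k^2 + \rho(\sum c_k)^2\}$. All function names are hypothetical.

```python
def xi_mse(coeffs, sigma2, rho):
    """xi-variance of sum_k c_k Z_k under equal variance and equal correlation."""
    s1 = sum(coeffs)
    s2 = sum(c * c for c in coeffs)
    return sigma2 * ((1 - rho) * s2 + rho * s1 * s1)


def error_coeffs_t_star(sample, a):
    """Coefficients of Z_k in T* - Ybar for a fixed sample."""
    N, nu, in_s = len(a), len(sample), set(sample)
    rest_total = sum(a[k] for k in range(N) if k not in in_s)
    return [rest_total / (N * nu) if k in in_s else -a[k] / N for k in range(N)]


def error_coeffs_t_circ(sample, a):
    """Coefficients of Z_k in T-circle - Ybar for a fixed sample."""
    N, nu, in_s = len(a), len(sample), set(sample)
    a_bar = sum(a) / N
    return [a_bar / nu - a[k] / N if k in in_s else -a[k] / N for k in range(N)]


def gap_for_sample(sample, a, sigma2, rho):
    """Sample-wise version of the right hand side of the MSE difference."""
    N, nu = len(a), len(sample)
    a_bar_s = sum(a[k] for k in sample) / nu
    return (1 - rho) * sigma2 * sum((a[k] - a_bar_s) ** 2 for k in sample) / N ** 2


a = [1.0, 2.0, 3.0, 4.0, 5.0]
sample = [0, 4]
lhs = (xi_mse(error_coeffs_t_circ(sample, a), 2.0, 0.3)
       - xi_mse(error_coeffs_t_star(sample, a), 2.0, 0.3))
rhs = gap_for_sample(sample, a, 2.0, 0.3)
```

For this sample both sides agree, and the gap vanishes only when all $a_k$ in the sample are equal.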


The comparison given in Theorem 2.4 holds for any $p$, but neither $T^*$ nor $T^\circ$ is necessarily p-unbiased. However, both are $\xi$-unbiased under Model $G_T$.

Models such as $G_R$ and $G_{MR}$ assume that $Y_1,\dots,Y_N$ are independently distributed. Under the assumption of independence, the optimal $\xi$-unbiased predictor is given by the following theorem.




Theorem 2.5 (Cassel et al., 1977). Let $p$ be any given design, and let $T = f_s\bar{Y}_s + (1-f_s)U$ and $T' = f_s\bar{Y}_s + (1-f_s)U'$ be two $\xi$-unbiased predictors of $\bar{Y}$. Then if $\xi$ is a product measure (as under Models $G_R$ and $G_{MR}$), the inequality

$\mathcal{E}E(T-\bar{Y})^2 \le \mathcal{E}E(T'-\bar{Y})^2$

holds if and only if, for any $s \in \mathcal{S}$ such that $p(s) > 0$,

$\mathcal{E}(U - \bar\mu_{\bar s})^2 \le \mathcal{E}(U' - \bar\mu_{\bar s})^2$,

where $\bar\mu_{\bar s} = \mathcal{E}(\bar{Y}_{\bar s})$. If, for some $s$ with $p(s) > 0$, the latter inequality is strict, then the former inequality is also strict. □


A similar type of result was derived by Fuller (1970). In view of these results, much of classical parametric estimation is relevant to finite population sampling. If we know something about the shape of the distribution, it is possible to construct predictors of $\bar{Y}$ which are more efficient than the sample mean $\bar{Y}_s$. For example, Fuller (1970) proposed simple predictors of $\bar{Y}$ when the tail of the distribution is well approximated by the tail of a Weibull distribution, and Ringer, Jenkins and Hartley (1972) proposed a square root predictor for a positively skewed population.

Much of the literature on super-populations contains discussion of models $G_R$ and $G_{MR}$. We have already mentioned in Chapter 1 that the idea of a super-population first came from the analysis of ratio and regression estimators. The following theorem, due to Brewer (1963) and Royall (1970b), is the most important result under Model $G_R$. The theorem is true for any design $p$ and gives the best linear $\xi$-unbiased ($\xi$-BLU) predictor of $\bar{Y}$. Here the best




is in the sense of min $\mathcal{E}$MSE. We shall denote this $\xi$-BLU predictor by $T_{BR}$. It also comes out from this theorem that $T_{BR}$ does not depend explicitly on the design, but its $\mathcal{E}$MSE does.


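Theorem 2.6. Under Model $G_R$, and for known auxiliary variable measurements $x_k$ $(k = 1,\dots,N)$, the $\xi$-BLU predictor of $\bar{Y}$ is, for any design $p$, given by

(2.36)  $T_{BR} = f_s\bar{Y}_s + (1-f_s)\hat\beta\,\bar{x}_{\bar s}$,

where $\bar{x}_{\bar s}$ is the mean of $x_k$ over the non-sampled units and

(2.37)  $\hat\beta = \dfrac{\sum_{s} x_k Y_k/u(x_k)}{\sum_{s} x_k^2/u(x_k)}$.

Furthermore,

(2.38)  $\mathcal{E}\mathrm{MSE}(p, T_{BR}) = \dfrac{E\{(\sum_{\bar s} x_k)^2\,V(\hat\beta) + \sigma^2\sum_{\bar s} u(x_k)\}}{N^2}$,

where $V(\hat\beta) = \sigma^2/\sum_{s}\{x_k^2/u(x_k)\}$.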
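The predictor of Theorem 2.6 and the sample-wise term of its $\mathcal{E}$MSE can be computed directly. The following is a minimal sketch under the assumption that the variance function $u$ is supplied as a Python callable; the names `beta_hat`, `t_br` and `xi_mse_t_br` are hypothetical.

```python
def beta_hat(sample, x, y, u):
    """Weighted least squares slope through the origin, weights 1 / u(x_k)."""
    num = sum(x[k] * y[k] / u(x[k]) for k in sample)
    den = sum(x[k] ** 2 / u(x[k]) for k in sample)
    return num / den


def t_br(sample, x, y, u):
    """Observed sample total plus predicted total of the non-sampled units, over N."""
    N, in_s = len(x), set(sample)
    sum_x_rest = sum(x[k] for k in range(N) if k not in in_s)
    return (sum(y[k] for k in sample) + beta_hat(sample, x, y, u) * sum_x_rest) / N


def xi_mse_t_br(sample, x, u, sigma2=1.0):
    """xi-MSE of the predictor for a fixed sample (term inside E{.}, over N^2)."""
    N, in_s = len(x), set(sample)
    rest = [k for k in range(N) if k not in in_s]
    var_beta = sigma2 / sum(x[k] ** 2 / u(x[k]) for k in sample)
    sum_x_rest = sum(x[k] for k in rest)
    return (sum_x_rest ** 2 * var_beta + sigma2 * sum(u(x[k]) for k in rest)) / N ** 2


x = [1.0, 2.0, 3.0, 4.0]
y = {1: 4.0, 3: 8.0}          # only the sampled values are needed
ratio_type = t_br([1, 3], x, y, u=lambda t: t)
mse = xi_mse_t_br([1, 3], x, u=lambda t: t, sigma2=1.0)
```

With $u(x) = x$ the computed value coincides with the classical ratio form $\bar{x}\,\bar{Y}_s/\bar{x}_s$, here $2.5 \times 6/3 = 5$.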

s 0 
Special Cases: Let us denote $T_{BR}$ by $T_{BRg}$ if $u(x) = x^g$. If $u(x) = x^g$, as assumed in much of the earlier survey sampling literature, we have the following special cases.

(i) If $u(x) = x$ (i.e., $g = 1$), then $T_{BR1} = T_R$, the classical ratio predictor



(2.39)  $T_R = \bar{x}\,\dfrac{\bar{Y}_s}{\bar{x}_s}$

with

(2.40)  $\mathcal{E}\mathrm{MSE}(p, T_R) = \dfrac{\bar{x}\,E\{(N\bar{x}/\sum_{s} x_k) - 1\}\sigma^2}{N}$.

The predictor $T_R$ is $\xi$-unbiased.

(ii) If $u(x) = x^2$ (i.e., $g = 2$), then

(2.41)  $T_{BR2} = \bar{x}\bar{R}_s + f_s(\bar{Y}_s - \bar{x}_s\bar{R}_s)$,

where

(2.42)  $\bar{R}_s = \dfrac{1}{\nu(s)}\sum_{s}\dfrac{Y_k}{x_k}$.

If $p$ is an FES(n) design, i.e., $\nu(s) = n$, and assuming $\alpha_k = n x_k/(N\bar{x})$ in the Horvitz-Thompson predictor $T_{HT}$ of (2.11), then we have $T_{HT} = \bar{x}\bar{R}_s$. Royall (1970b) has shown that if (i) $p$ is any FES(n) design, (ii) $n x_k/(N\bar{x}) \le 1$ for all $k = 1,\dots,N$, and (iii) $u(x)/x^2$ is a non-increasing function (usually $0 \le g \le 2$), then under Model $G_R$,

$\mathcal{E}\mathrm{MSE}(p, \bar{x}\bar{R}_s) \ge \mathcal{E}\mathrm{MSE}(p, T_{BR2})$,

or, in the present form of $T_{HT}$,

$\mathcal{E}\mathrm{MSE}(p, T_{HT}) \ge \mathcal{E}\mathrm{MSE}(p, T_{BR2})$.

It is clear from the above that the $\mathcal{E}$MSE of the strategy $(p, T_{BR})$ depends on $p$ through $E(\cdot)$ in (2.38). A pre-sampling judgment may be required as to how $p$ should be chosen such that $\mathcal{E}\mathrm{MSE}(p, T_{BR})$ is minimized. Under model $\xi$, after the sample has already been selected, the inference problem is simply the classic one




of predicting the unobserved random variable $\bar{Y}_{\bar s}$, and the sample $s$ should be one which permits a good predictor. This idea of Royall (1970b) has been criticized by some authors for adopting purposive samples.

Expression (2.38) can be rewritten as follows:

(2.43)  $\mathcal{E}\mathrm{MSE}(p, T_{BR}) = E\{\mathcal{E}(T_{BR}-\bar{Y})^2\} = \dfrac{1}{N^2}E\left[(\sum_{\bar s} x_k)^2\,\mathcal{E}(\hat\beta-\beta)^2 + \sigma^2\sum_{\bar s} u(x_k)\right]$.

Now if our objective is to find a design $p$ for which this is minimum, then we have two options:

(i) to select a sample which will give a good estimate of the expected value of the mean of the non-sampled units, i.e., to choose $s$ so that

$\dfrac{1}{N^2}(\sum_{\bar s} x_k)^2\,\mathcal{E}(\hat\beta-\beta)^2$

is small, or

(ii) to observe those y-values which have the greatest variances, so that only the sum of the least variable values is to be predicted, i.e., to choose $s$ such that $\sum_{\bar s} u(x_k)$ is small.

So it turns out that, for a wide class of variance functions, the optimum strategy is to use $T_{BR}$ with a purposive sample $s$ of FES(n) which contains the $n$ largest x-values of the population.

Formally, let $\mathcal{S}_n = \{s : \nu(s) = n\}$ and let $s^*$ be the set of labels such that

(2.44)  $\max_{s \in \mathcal{S}_n}\sum_{s} x_k = \sum_{s^*} x_k$,

and let the design $p^* = p^*(s)$ be such that




(2.45)  $p^*(s) = \begin{cases} 1 & \text{if } s = s^*, \\ 0 & \text{if } s \ne s^*. \end{cases}$

Then the theorem of Royall (1970b) follows:

Theorem 2.7. Let $p$ be any FES(n) design, and let $p^*$ be defined by (2.45). If $u(x)$ is non-decreasing and $u(x)/x^2$ is non-increasing, then, under Model $G_R$,

$\mathcal{E}\mathrm{MSE}(p, T) \ge \mathcal{E}\mathrm{MSE}(p^*, T_{BR})$,

where $T$ is any linear $\xi$-unbiased predictor of $\bar{Y}$ and $T_{BR}$ is given by (2.36). □
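The content of Theorem 2.7 can be explored on a small artificial population by brute force. The sketch below is an illustration only: the population values are made up, the function names are hypothetical, and the $\xi$-MSE for a fixed sample is taken to be the term inside $E\{\cdot\}$ of (2.38) divided by $N^2$.

```python
from itertools import combinations


def xi_mse_t_br(sample, x, u, sigma2=1.0):
    """xi-MSE of the BLU predictor under G_R for a fixed sample."""
    in_s = set(sample)
    rest = [k for k in range(len(x)) if k not in in_s]
    var_beta = sigma2 / sum(x[k] ** 2 / u(x[k]) for k in sample)
    sum_x_rest = sum(x[k] for k in rest)
    return (sum_x_rest ** 2 * var_beta
            + sigma2 * sum(u(x[k]) for k in rest)) / len(x) ** 2


def extreme_sample(x, n):
    """Labels of the n largest x-values, i.e. the purposive sample s*."""
    order = sorted(range(len(x)), key=lambda k: x[k], reverse=True)
    return tuple(sorted(order[:n]))


def best_sample_by_search(x, n, u, sigma2=1.0):
    """Sample of size n with the smallest xi-MSE, found by enumeration."""
    return min(combinations(range(len(x)), n),
               key=lambda s: xi_mse_t_br(s, x, u, sigma2))


x = [1.0, 2.0, 3.0, 5.0, 8.0, 13.0]   # hypothetical auxiliary values
s_star = extreme_sample(x, 3)
s_best_g1 = best_sample_by_search(x, 3, u=lambda t: t)
s_best_g2 = best_sample_by_search(x, 3, u=lambda t: t * t)
```

For both variance functions tried here, which satisfy the conditions of the theorem, the enumeration returns the sample of the three largest x-values. Enumeration is feasible only for very small $N$ and is used purely as a check.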


Use of this type of extreme design is open to much criticism. J.N.K. Rao (1975) points out that there are, no doubt, situations in which the extreme design $p^*$ can be highly efficient for prediction of one y-mean. But in most surveys we also estimate mean values of other characters. In such situations an extreme sample is not likely to work well if several means have to be estimated in the same survey. So, it is preferable and safer to use a simple random sample in the case of multipurpose studies.

It is also obvious, from the above, that the result depends too much on the assumed model. If Model $G_R$ is not true, that is, if $\mathcal{E}(Y_k) = \beta x_k^m$ with $m \ne 1$ ($m = 1$ in the case of $G_R$), then the $\xi$-bias of $T_R$ is

$\mathcal{E}(T_R - \bar{Y}) = \beta\bar{x}\left\{\dfrac{\sum_{s} x_k^m}{\sum_{s} x_k} - \dfrac{\sum_{1}^{N} x_k^m}{\sum_{1}^{N} x_k}\right\}$.

A simple random sample is likely to give small bias in such cases, but




the extreme design $p^*$ is expected to produce higher $\xi$-bias.
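The effect of model failure on the ratio predictor can be illustrated with a short sketch. It assumes, for illustration, that the true model has $\mathcal{E}(Y_k) = \beta x_k^m$ and evaluates the $\xi$-bias of $\bar{x}\,\bar{Y}_s/\bar{x}_s$ for a fixed sample; the function name and the population values are hypothetical.

```python
def ratio_predictor_bias(sample, x, beta, m):
    """xi-bias of xbar * ybar_s / xbar_s when E(Y_k) = beta * x_k ** m."""
    N = len(x)
    x_bar = sum(x) / N
    sample_ratio = sum(x[k] ** m for k in sample) / sum(x[k] for k in sample)
    population_ratio = sum(v ** m for v in x) / sum(x)
    return beta * x_bar * (sample_ratio - population_ratio)


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
extreme = [3, 4, 5]        # the three largest x-values
spread = [1, 2, 5]         # a sample spread over the range of x

bias_extreme = ratio_predictor_bias(extreme, x, beta=1.0, m=2)
bias_spread = ratio_predictor_bias(spread, x, beta=1.0, m=2)
bias_correct_model = ratio_predictor_bias(extreme, x, beta=1.0, m=1)
```

In this example the extreme sample gives a much larger absolute bias than the spread sample when $m = 2$, while both are unbiased when $m = 1$.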
Results based on Model $G_R$ can be easily extended to Model $G_{MR}$. Various results under Model $G_{MR}$ have been given by Hartley and Sielken (1975), Royall (1976), Royall and Cumberland (1978a) and Tallis (1978). To present some of these results, let us introduce the following notation:

$\mathcal{E}(Y_s) = X_s\beta$,  $\mathcal{E}(Y_{\bar s}) = X_{\bar s}\beta$;

$V(Y_s) = \sigma^2 V_s$,  $V(Y_{\bar s}) = \sigma^2 V_{\bar s}$,  $C(Y_s, Y_{\bar s}) = 0$,

where $Y_s$ is a $\nu(s)$-vector, i.e., the vector of sampled $Y_k$ values, $k \in s$, and $Y_{\bar s}$ is an $\{N-\nu(s)\}$-vector having the non-sampled Y-values as its components. Let us further assume that in both cases the $Y_k$'s are enumerated in order of increasing $k$. $\beta' = (\beta_1,\dots,\beta_q)$ is a vector of unknown parameters. The known matrices $X_s$ and $X_{\bar s}$ are of order $\nu(s) \times q$ and $\{N-\nu(s)\} \times q$ respectively. Let the row vector corresponding to unit $k$ of $X_s$ or $X_{\bar s}$ be denoted by $x_k' = (x_{k1},\dots,x_{kq})$, where $x_{k1} = 1$ for all values of $k$. So we have $(q-1)$ auxiliary variables, $x_{k2},\dots,x_{kq}$, measured on each unit $k$ of the population. The diagonal matrices $V_s$ and $V_{\bar s}$ are of order $\nu(s) \times \nu(s)$ and $\{N-\nu(s)\} \times \{N-\nu(s)\}$ respectively. The diagonal element for unit $k$ is $u_k$, a known quantity. Hence

$V(Y_k) = \sigma^2 u_k$,

where $\sigma^2$ is unknown.
Under Model $G_{MR}$, the following theorem due to Cassel et al. (1977) gives the $\xi$-BLU predictor.

Theorem 2.8. Under Model $G_{MR}$, and for known auxiliary measurements $X_s$ and $X_{\bar s}$, the $\xi$-BLU predictor of $\bar{Y}$, for any design $p$, is




(2.48)  $T_{BLU} = f_s\bar{Y}_s + (1-f_s)\,m_{\bar s}'\hat\beta$,

where $m_{\bar s}' = (m_{\bar s1},\dots,m_{\bar sq})$ and, for $j = 1,\dots,q$,

$m_{\bar sj} = \sum_{\bar s} x_{kj}/\{N-\nu(s)\}$.

Moreover,

(2.49)  $\hat\beta = (X_s'V_s^{-1}X_s)^{-1}X_s'V_s^{-1}Y_s$.

$\mathcal{E}\mathrm{MSE}(p, T_{BLU})$ is equal to the p-expectation of

(2.50)  $\mathcal{E}(T_{BLU} - \bar{Y})^2 = \left[\dfrac{1}{N^2}\sum_{\bar s} u_k + (1-f_s)^2\{m_{\bar s}'(X_s'V_s^{-1}X_s)^{-1}m_{\bar s}\}\right]\sigma^2$. □
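The predictor of Theorem 2.8 amounts to a weighted least squares fit on the sample followed by prediction of the non-sampled mean. The sketch below is an illustration in plain Python: it assumes $X_s'V_s^{-1}X_s$ is non-singular and uses a small Gauss-Jordan solver written for this purpose; the names `solve` and `t_blu` are hypothetical.

```python
def solve(A, rhs):
    """Solve A z = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [v - factor * w for v, w in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def t_blu(sample, X, y, u):
    """Return (predictor of the population mean, weighted least squares beta).

    X: list of N rows x_k' (first entry 1); y: sampled values indexed by label;
    u: known variance multipliers u_k for all N units.
    """
    N, q, in_s = len(X), len(X[0]), set(sample)
    rest = [k for k in range(N) if k not in in_s]
    # Normal equations (X_s' V_s^{-1} X_s) beta = X_s' V_s^{-1} Y_s.
    lhs = [[sum(X[k][i] * X[k][j] / u[k] for k in sample) for j in range(q)]
           for i in range(q)]
    rhs = [sum(X[k][i] * y[k] / u[k] for k in sample) for i in range(q)]
    beta = solve(lhs, rhs)
    f_s = len(sample) / N
    y_bar_s = sum(y[k] for k in sample) / len(sample)
    m = [sum(X[k][j] for k in rest) / len(rest) for j in range(q)]
    predicted_rest_mean = sum(m_j * b_j for m_j, b_j in zip(m, beta))
    return f_s * y_bar_s + (1 - f_s) * predicted_rest_mean, beta


# Data lying exactly on y = 1 + 2x: the predictor recovers the population mean.
X = [[1.0, float(v)] for v in (1, 2, 3, 4, 5)]
y = {0: 3.0, 2: 7.0, 4: 11.0}
estimate, beta = t_blu([0, 2, 4], X, y, u=[1.0] * 5)
```

A numerical library would normally be used for the linear algebra; the hand-written solver only keeps the sketch self-contained.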


It is clear that under models $G_R$ and $G_{MR}$ the optimal $\xi$-unbiased predictors are built from weighted least squares estimators. They do not depend on any particular design. On this point Scott and Smith (1974) say,

"The fact that the estimators do not depend on the design p(.) may worry some people, but it seems to us that when prior knowledge is so strong that it can be specified by model of the form (1) (simple linear regression model) then the relationships expressed in the model should override the sampling scheme for certain purposes."

Obviously, model based inference depends very much on the model assumed. So the natural question is: what will be the behavior of the optimal predictors if the assumed model is not true, or deviates slightly from the truth? This leads us to study the robustness of predictors.




§2.6 ROBUSTNESS IN MODEL BASED INFERENCE 


In real life, it is not known which model is producing our
actual population. So, whenever we have doubts about the assumed
super-population model, the correctness of the results established in
the preceding sections becomes questionable. Royall and Herson
(1973a, b) first discussed this problem with a polynomial regression
model:

(2.51)    $Y_k = h(x_k) + e_k$,    $k = 1, \ldots, N$,

where

(2.52)    $h(x_k) = \sum_{j=0}^{J} \delta_j \beta_j x_k^j$

and $\delta_j = 1$ or $0$ depending on whether the term $x^j$ is present in
the model or not. Also the $e_k$'s are uncorrelated random errors with
$E(e_k) = 0$ and $V(e_k) = \sigma^2 u(x_k)$, $k = 1, \ldots, N$. They denoted this
model as $\xi(\delta_0, \delta_1, \ldots, \delta_J : u(x))$. In our present notation this is a
special case of Model $G_{MR}$. In particular, if $\delta_0 = 0$, $\delta_1 = 1$
and $\delta_2 = \cdots = \delta_J = 0$, then the above model reduces to
$\xi(0, 1 : u(x))$, which is Model $G_R$. Let us consider the following
two cases:


Case 1. Misspecification of variance function under Model $G_R$.

The Brewer-Royall predictor, $T_{BR}$, as defined in (2.36) and
(2.37) is the $\xi$-BLU predictor of $\bar{Y}$ under Model $G_R$. The form of
$T_{BR}$ obviously depends on the specification of $u(x)$. In previous
sections we also introduced Model $G_{Rg}$, which is Model $G_R$ with
$u(x) = x^g$, and the corresponding $\xi$-BLU predictor is $T_{BRg}$. If the
assumed model is $G_{Rg_0}$, then $T_{BRg_0}$ is supposed to be optimal.
But, if it so happens




that the true model is $G_{Rg_1}$, $g_1 \neq g_0$, then in general $T_{BRg_0}$ is
no longer most efficient, although it is still $\xi$-unbiased. In this
case the preferred predictor is $T_{BRg_1}$. For fixed values of $g_0$ and
$g_1$, neither of which is necessarily the true value of $g$, the following
theorem due to Royall (1970b) gives us an indication of which
predictor to prefer.

Theorem 2.9. If $0 \le g_0 < g_1$ then, for any FES(n) design $p$, and
for any specification of the function $u(x)$ in Model $G_R$ such that
$u(x)/x^{g_0}$ is non-increasing,

(2.53)    $\mathrm{EMSE}(p, T_{BRg_0}) \le \mathrm{EMSE}(p, T_{BRg_1})$.

For any function $u(x)$ such that $u(x)/x^{g_1}$ is non-decreasing, the
inequality in (2.53) is reversed. For strict inequality in (2.53),
it is sufficient that $p(s) > 0$ for some $s$ such that $\nu(s) = n$
and $x_k \neq x_l$ for some $k \neq l$ in $s$.
Case 2. Misspecification in polynomial regression model. 


Theorem 2.10. (Royall and Herson, 1973a). Under the model
$\xi(\delta_0, \delta_1, \ldots, \delta_J : u(x))$ and for known auxiliary variable
measurements $x_k > 0$, $k = 1, \ldots, N$, the $\xi$-BLU predictor of $\bar{Y}$
for any design $p$ is given by

(2.54)    $T_{BLU} = f_s \bar{y}_s + (1 - f_s) \sum_{j=0}^{J} \delta_j \hat{\beta}_j m_{\bar{s}j}$,

where, for $j = 0, \ldots, J$,




(2.55)    $m_{\bar{s}j} = \sum_{k \in \bar{s}} x_k^j / (N - \nu(s))$

and the $\hat{\beta}_j$ are the least squares estimates of the $\beta_j$'s under the
model $\xi(\delta_0, \delta_1, \ldots, \delta_J : u(x))$.

This theorem gives the $\xi$-BLU predictor of $\bar{Y}$ in this
situation, assuming that the $\beta_j$'s are estimable.


Royall and Herson (1973a) analyzed robustness using model
$\xi(0, 1 : x)$ and the corresponding predictor, $T_R = \bar{x}\,\bar{y}_s / \bar{x}_s$. By
Theorem 2.9, this predictor is $\xi$-BLU and is the classical ratio
estimator. If, however, the alternative model
$\xi(\delta_0, \delta_1, \ldots, \delta_J : u(x))$ is true, then the preferred predictor,
for any design, is given by Theorem 2.10. Moreover, $T_R$ is biased
under any model $\xi(\delta_0, \ldots, \delta_J : u(x))$ different from $\xi(0, 1 : x)$.
The bias of $T_R$ is

(2.56)    $E_\xi(T_R - \bar{Y}) = \sum_{j=0}^{J} \delta_j \beta_j m_1 \{(m_{sj}/m_{s1}) - (m_j/m_1)\}$,

where

$m_{sj} = \frac{1}{\nu(s)} \sum_{k \in s} x_k^j$,    $m_j = \frac{1}{N} \sum_{k=1}^{N} x_k^j$.

It is clear from (2.56) that the $\xi$-bias is zero if

(2.57)    $m_{sj}/m_{s1} = m_j/m_1$

for all $j$ such that $\delta_j = 1$ in the model $\xi(\delta_0, \ldots, \delta_J : u(x))$. The
idea of a balanced sample in survey sampling comes from this relation.
Royall and Herson (1973a) defined a balanced sample as follows:

Balanced Sample: A balanced sample denoted by s(J) is a sample




satisfying (2.57) for all $j = 1, \ldots, J$, that is, s(J) is such that

(2.58)    $m_{sj} = m_j$,    $j = 1, \ldots, J$.

A sample $s$ such that (2.58) holds for $j = j_0 \le J$ is said to be
balanced on the $j_0$-th moment. A design $p$ which selects, with
probability one, a balanced sample will be called a balanced (sample)
design and will be denoted by $p_{b(J)}$.

It is difficult to get a sample which is balanced up
to the $J$-th order. A simple random sample usually gives an approximately
balanced sample. We shall discuss methods of approximating balanced
samples and their alternatives in Chapter 3.
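
As a small numerical illustration of (2.56) to (2.58), the following Python sketch computes the sample and population moments and the $\xi$-bias of the ratio predictor under a polynomial model. The population of auxiliary values 1, ..., 8, the regression coefficients and the function names are illustrative assumptions, not taken from Royall and Herson (1973a).

```python
from typing import Sequence


def moment(values: Sequence[float], j: int) -> float:
    """j-th raw moment: the mean of x**j over the given values."""
    return sum(v ** j for v in values) / len(values)


def is_balanced(sample_x, population_x, J, tol=1e-12):
    """Check condition (2.58): m_sj equals m_j for j = 1, ..., J."""
    return all(
        abs(moment(sample_x, j) - moment(population_x, j)) <= tol
        for j in range(1, J + 1)
    )


def ratio_bias(sample_x, population_x, delta, beta):
    """xi-bias of the ratio predictor, following the form of (2.56)."""
    m1 = moment(population_x, 1)
    ms1 = moment(sample_x, 1)
    return sum(
        d * b * m1 * (moment(sample_x, j) / ms1 - moment(population_x, j) / m1)
        for j, (d, b) in enumerate(zip(delta, beta))
    )


# Hypothetical population of auxiliary values and two samples of size 4.
population = [1, 2, 3, 4, 5, 6, 7, 8]
balanced = [1, 4, 6, 7]     # matches the first two population moments
extreme = [1, 2, 3, 4]      # purposive sample of the smallest units

delta = (1, 1, 1)           # intercept, linear and quadratic terms present
beta = (1.0, 1.0, 1.0)      # assumed regression coefficients

print(is_balanced(balanced, population, J=2))
print(ratio_bias(balanced, population, delta, beta))
print(ratio_bias(extreme, population, delta, beta))
```

In this made-up example the sample {1, 4, 6, 7} is balanced on the first two moments but not on the third, so the bias vanishes for a quadratic model while the extreme sample {1, 2, 3, 4} leaves a sizeable bias.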

Using a balanced sample we can eliminate the bias incurred
by the ratio estimator $T_R$ if the actual model is $\xi(1, 1 : u(x))$.
But to secure this property for the estimator, we have to
give up some efficiency. Assuming an FES(n) design, Royall and Herson
(1973a) compare the balanced sampling strategy $R_{BAL} = (p_{b(J)}, T_R)$ to
$R_{OPT} = (p^*, T_R)$, which has minimum EMSE under $\xi(0, 1 : x)$, where
$p^*$ is the optimum design as given in (2.45). We have


EMSE(R yp) = min, (%/X,)(I-f)%0°/n 
s< cd. a) 
and 
EusE(R,,) = (I-f)x on, 
where 


fale ete e ey (s)u= men 


Therefore, efficiency loss is the absolute value of 




(2.59) min weoew sens = he < OL 
gehen iat 


Royall and Herson (1973a) have given some numerical results on efficiency
for different types of populations. The general conclusion of their study is
that the shape of the distribution is a less important factor in determining
the efficiency than is y, the ratio of extremes of the distribution
with finite lower and upper limits of range. Another result is
that the protection against $\xi$-bias is often costly from an efficiency
point of view.

The balanced design, $p_{b(J)}$, protects the predictor $T_R$
against the $\xi$-bias which would be incurred if $\xi(1, 1 : u(x))$, not
$\xi(0, 1 : x)$, is the true model. The most attractive property of
balanced sample design is that if the conditions $m_{sj} = m_j$,
$j = 1, \ldots, J$, are satisfied, then $T_R$ is protected against $\xi$-bias
under any model $\xi(\delta_0, \ldots, \delta_J : u(x))$. There is no additional loss
of efficiency of $(p_{b(J)}, T_R)$ relative to $R_{OPT}$. It is also observed
that $T_R$ reduces to $\bar{y}_s$ under balanced sampling. Royall and Herson
(1973a) have also shown that if $T = T(\delta_0, \ldots, \delta_J : u(x))$ denotes the
$\xi$-BLU predictor given by Theorem 2.10, then under the balanced
design $p_{b(J)}$,


TDS oO aidaie wisn scl eu) 


15 ; T(9) ss 3 a geuescly OMe ti) cect. 


Z 


ote TO) 294 90%5 


for any configuration 892 dyrrer 9 dy Gt es 0i) ance 7Lose 


The idea of balanced sampling is also extended to the
classical regression predictor,

(2.60)    $T_{REG} = \bar{y}_s + \hat{\beta}(\bar{x} - \bar{x}_s)$




with

$\hat{\beta} = \sum_{k \in s} (x_k - \bar{x}_s) Y_k \Big/ \sum_{k \in s} (x_k - \bar{x}_s)^2$,

which is $\xi$-BLU for model $\xi(1, 1 : 1)$. Now, if the alternative model
$\xi(\delta_0, \delta_1, \ldots, \delta_J : u(x))$ were actually true, then in general
$T_{REG}$ is $\xi$-biased. This bias can be removed if we choose the balanced
design $p_{b(J)}$. For any balanced sample, the predictor $T_{REG}$ also
reduces to the sample mean $\bar{y}_s$.

Balancing the design is on the average equivalent to
p-unbiasedness, and the protection sought by balancing eliminates the
efficiency gain realized under the model based approach if we were willing
to accept an extreme, purposive sample as a basis for inference.
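
The statement that both the ratio predictor and the classical regression predictor reduce to the sample mean once the sample mean of x equals the population mean of x can be checked numerically. The sketch below assumes the standard forms $\bar{x}\,\bar{y}_s/\bar{x}_s$ and $\bar{y}_s + b(\bar{x} - \bar{x}_s)$ with an ordinary least squares slope $b$; the x values, the y observations and the function names are made up for illustration.

```python
def ratio_predictor(sample_x, sample_y, pop_mean_x):
    """Ratio predictor of the population mean: xbar * ybar_s / xbar_s."""
    xbar_s = sum(sample_x) / len(sample_x)
    ybar_s = sum(sample_y) / len(sample_y)
    return pop_mean_x * ybar_s / xbar_s


def regression_predictor(sample_x, sample_y, pop_mean_x):
    """Regression predictor: ybar_s + slope * (xbar - xbar_s)."""
    n = len(sample_x)
    xbar_s = sum(sample_x) / n
    ybar_s = sum(sample_y) / n
    sxx = sum((x - xbar_s) ** 2 for x in sample_x)
    sxy = sum((x - xbar_s) * y for x, y in zip(sample_x, sample_y))
    slope = sxy / sxx  # ordinary least squares slope fitted on the sample
    return ybar_s + slope * (pop_mean_x - xbar_s)


# Hypothetical auxiliary values for the whole population.
population_x = [1, 2, 3, 4, 5, 6, 7, 8]
pop_mean_x = sum(population_x) / len(population_x)

# A sample whose x-mean equals the population x-mean (balanced on j = 1).
sample_x = [1, 4, 6, 7]
sample_y = [2.1, 7.9, 12.5, 13.8]    # made-up observations
ybar_s = sum(sample_y) / len(sample_y)

print(ybar_s)
print(ratio_predictor(sample_x, sample_y, pop_mean_x))
print(regression_predictor(sample_x, sample_y, pop_mean_x))
```

All three printed values agree up to floating-point rounding, whatever y values are observed, because the correction terms in both predictors depend only on the difference between the sample and population means of x.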

Recently attempts have been made by many authors to compare
and, if possible, to combine the design based approach and the model based
approach in survey sampling. Royall (1976a) and Scott and Smith (1969)
have applied super-population models to two-stage sampling. Scott and
Smith derived results by using Bayesian techniques and established
optimality among linear unbiased estimators. Royall (1976b) has
studied the linear least squares prediction approach in two-stage sampling
and then used a probability model to analyze various conventional
estimators and certain estimators suggested by the theory as alternatives
to the conventional estimators. Särndal (1978) has compared the two
approaches for estimating the population mean. He showed that several of the
conventional results can be obtained and reinterpreted through model
based theory and found that the model based framework often offers
advantages over the design based one when it comes to presenting a lucid




argument in favour of some given sampling procedure. Thompsen (1978)
has given some examples where super-population ideas in survey sampling
were applied to different surveys in Norway. Empirical studies of
prediction theory have been done most recently by Royall and Cumberland
(1978b, 1981).




CHAPTER III 


RANDOMIZATION AND BALANCED SAMPLING 


§3.1 RANDOMIZATION 


Randomization is a well-known and widely used method of survey
data collection and analysis. The main purpose of this method is to make
objective inferences and to present the results of a survey in a convincing
way to users. Keeping these and other advantages in mind, under varied
population structures, survey statisticians have developed, in past years,
different survey designs and hence various estimators for parameters of
interest. These design based inferences are still overwhelmingly in use.
Since 1960, design based inference has faced new challenges. This has
been briefly discussed in Chapter 1.

The method of maximum likelihood is still one of the most
important ways of estimation in statistics. However, for a long time,
the likelihood method was essentially a failure in survey sampling,
especially under the design based approach. For any design p(.) and for any
population vector $y = (y_1, \ldots, y_N)$ treated as a parameter, the
probability that the random quantity $D$ will take a value
$d = \{(k, y_k) : k \in s\}$ is given by

(3.1)    $P_y(d) = p(s)$ if $d$ is consistent with $y$, i.e., if $y \in \Omega_d$,
         $P_y(d) = 0$ otherwise,

where a specified value $d = \{(k, y_k) : k \in s\}$ is said to be consistent




with a population vector $y_0 = (y_{01}, \ldots, y_{0N})$ if, and only if, $y_k = y_{0k}$
for all $k \in s = \{k_1, \ldots, k_n\}$, the sample. $\Omega_d$ is the set of all
$y \in R_N$ such that $d$ is consistent with $y$.

It follows from (3.1) that $\Pr(D = d \mid S = s) = 1$ if $d$ is
consistent with $y$, and zero otherwise. If our interest is in the
parameter $y$ and if the design is uninformative, then from (3.1) we also
find that the likelihood function $L(y \mid d) = P_y(d)$ takes the same
value for every $y$ consistent with $d$. That is, the likelihood is flat, so
every consistent value of $y$ is equally likely and no unique maximum
likelihood estimator is available. A likelihood function of the form (3.1),
which is not informative in nature, was first studied by Godambe (1966).
But with a super-population model at the back of the finite population,
the appropriate likelihood function may be more informative, Royall
(1976a). In view of (3.1), applying the likelihood principle to survey
sampling under the fixed population approach has the following two
consequences.

(i) Inference from survey data should be independent of the
sample design.

(ii) The only inference about $y$ sanctioned by the likelihood principle
is the trivial one that the components $y_k$, for $k \in s$, must
coincide with the observed values. It does not admit
discrimination among the possible values of the unobserved
components of $y$, since all the values of $y \in \Omega_d$ have the
same likelihood.
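
The flatness of the likelihood (3.1) can be made concrete with a short Python sketch. The population size, the observed values, the design probability and the 0-based unit labels below are all hypothetical choices made only for illustration.

```python
def likelihood(y, d, p_s):
    """Evaluate (3.1): p(s) if the data d are consistent with y, else 0.

    y   : candidate population vector, indexed by unit label k
    d   : dict mapping each sampled label k to its observed value y_k
    p_s : design probability p(s) of the sample actually drawn
    """
    consistent = all(y[k] == value for k, value in d.items())
    return p_s if consistent else 0.0


# Hypothetical population of N = 4 units; units 0 and 2 were sampled.
d = {0: 10.0, 2: 7.0}
p_s = 1 / 6          # e.g. simple random sampling of 2 units out of 4

y_a = [10.0, 3.0, 7.0, 5.0]      # agrees with d on the sampled units
y_b = [10.0, 99.0, 7.0, -4.0]    # also agrees, very different elsewhere
y_c = [11.0, 3.0, 7.0, 5.0]      # contradicts the observed value for unit 0

print(likelihood(y_a, d, p_s), likelihood(y_b, d, p_s), likelihood(y_c, d, p_s))
```

The vectors `y_a` and `y_b` receive the same likelihood although they differ widely on the unobserved units, which is exactly why (3.1) cannot discriminate among values of the unsampled components.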


However, from a somewhat different point of view another likelihood
function emerges which can yield a maximum likelihood estimate of $y$
under certain conditions; see Royall (1968) and Hartley and Rao (1968, 1969).




Very interesting and detailed discussions on the likelihood
function, sufficiency, randomization, etc. for finite population sampling
are given in a series of papers, namely Basu (1969, 1971, 1978). Basu
wrote in the summary of his 1969 paper: "[We] examine the role of
sufficiency and likelihood principle in the analysis of survey data
and arrived at the revolutionary but reasonable conclusion that, once
the sample has been drawn, the inference should not depend in any way
on the sampling design. This poses the problem of designing a survey
which will yield a good (representative) sample. The randomization
principle is examined from this view point and it is noticed that there
is very little, if any, use for it in survey design." In this design
based approach of sampling theory, as we mentioned earlier, there is
only one source of randomization in the data. The artificial
randomization created by the sampler himself is not inherent to the
problem. All results of conventional theory are based on this
randomization. Basu (1969) suggests, from the Bayesian point of view:

Once the data d is in our hand, forget about the sampling plan
(i.e., p(.)), which is an artificial source of randomization. In the
Bayesian plan for selecting the data d, there is no place for
symmetric dice or random number tables. But, unfortunately, until
recently sufficient attention has not been given to the problem. Basu
suggests that any reasonable Bayesian sampling strategy would have
the following characteristics:

(a) The sampling plan would usually be sequential. The statistician
should continue sampling (one or a few units at a time) until he is
satisfied with the information thus obtained or until he reaches the end
of his resources (time and cost). His decision to select the units for




a particular sampling stage would depend (non-randomly) on the sample
obtained in the previous stages.

(b) The probability that the statistician would end up observing
the units $s = (k_1, k_2, \ldots, k_n)$, in this order, would depend on $s$ and
the state of nature $y$. This probability would be degenerate, i.e.,
zero for some values of $y$ and unity for the rest of the values of $y$.

We have already mentioned that, viewing the likelihood
function from a different angle, some authors arrived at different
types of likelihood which readily yield a unique maximum. But it is
to be noted here that in all those likelihood functions they ignored
the label part $k$ of the data $d = \{(k, y_k) : k \in s\}$ and considered
the unlabeled data $d_0 = \{y_k : k \in s\}$. Basu (1971) pointed out that
the label part $k$ is an ancillary statistic, that is, the sampling
distribution of the statistic $k$ does not involve the state of
nature $y = (y_1, \ldots, y_N)$. The sampling distribution of $k = (k_1, \ldots, k_n)$
is uniquely determined by the sample design. It is therefore obvious that
the label part of the data cannot, by itself, provide any information
about $y$. Knowing $k$, we only know the names (labels) of the population
units that are selected for observation. Usually, we incorporate the
prior knowledge of the auxiliary variable $x = (x_1, \ldots, x_N)$ in the
sampling plan. But this does not alter the above situation. The
label part $k$ of the data $d$ will still be an ancillary statistic. Now
the question is: if the label part $k$ is informationless, then does
the observation part of the data, namely $d_0$, contain all the
available information about $y$? Basu (1971) answers this question
with a definite 'no' and says that a great deal of the information will




be lost if the label part of the data is suppressed. Without the
knowledge of $k$, the surveyor cannot relate the components of the
observation vector $y$ to the population units and so he cannot make
any use of the auxiliary character $x = (x_1, \ldots, x_N)$ and whatever
other prior knowledge he may have about the relationship between $y$
and $x$.

Basu (1978) has given a counterexample (Example 4.1) where
the optimum sampling plan would be sequential and non-randomized. In
that example it is very difficult to find any justification for random
sampling. Randomization is deeply rooted in statistics and is quite
difficult to dismiss with a few counterexamples. In the view of the above
mentioned statisticians, the main use of randomization is to safeguard
the sample against unknown biases. Unlike in the conventional approach,
the survey design can no longer be the only determinant for judging the
quality of the data. Basu (1978) suggests that the principal determinant
of how a particular datum ought to be analyzed is the datum itself. The key
concept in survey theory ought to be the notion of poststratification.
"Randomization is widely recognized as a basic principle of statistical
experimentation. Yet we find no satisfactory answer to the question,
Why randomize?", Basu (1980).


§3.2 BALANCED SAMPLING


Presently it is a general feeling among statisticians that
artificial randomization in survey sampling should not be the only
means of inference. Purposive sampling is nowadays increasingly
gaining justification for the analysis of survey data.




Purposive samples are subjective, but there are some rational and
objective justifications available for them. The discussion of
the previous section indicates that we need an alternative to the
randomization principle. The prediction approach or the Bayesian approach
in survey sampling may work as useful alternatives. These approaches usually
lead us to the selection of purposive samples. We have already mentioned
in Chapter 2 that under certain super-population models, the optimum
sampling strategy (2.45) or balanced samples are more desirable than
random samples.

We have defined the balanced sample in Chapter 2 and have
discussed the robustness of the estimators under balanced samples. In
this section we shall discuss how to get balanced samples and extensions.


Approximate Balanced Samples


Selection of exact balanced samples of higher order is a big
practical problem. Usually, not all values of the auxiliary variable $x$ are
known to the sampler. In such a case it is impossible to get a balanced
sample. Even if all values of $x$ are known, exact satisfaction
of $m_{sj} = m_j$, $j = 1, \ldots, J$, is usually impossible. However, when
$J$ and the sampling fraction $f$ are small, it is easier to get an
approximate balanced sample s(J). A random
selection of units is expected to give an approximately balanced
sample. The average value of $m_{sj}$ over all $\binom{N}{n}$ samples $s$ is
$m_j$ for $j = 1, 2, \ldots$. So we can expect that a random sample $s$ is
approximately s(J). A simple random sample is supposed to yield a fair
approximation to s(J) for $J > 1$, so when we use this approximate




balanced sample the ratio and regression estimators are approximately
unbiased under polynomial regression models. The estimators will
also be approximately optimal under models where the variance of Y
is a polynomial in x of degree J or less. It is true that large
deviations of the sample from the balanced sample are possible,
depending on the dispersion of the x values. If this circumstance arises,
then it is advisable to use restricted randomization, censoring or
post-stratification in the data. Royall and Herson (1973a) have given an
expression for the extent of bias in ratio estimators using approximate
balanced samples.

In surveys usually more than one auxiliary variable is
available. It is very difficult to get a sample that is balanced with
respect to all those characters. However, random selection will at
least justify some degree of confidence that the selected sample is
approximately representative. It is to be mentioned here that neither
purposive selection of a balanced sample nor restricted randomization
nor unrestricted random selection will guarantee balance on other
variables not explicitly considered in choosing the sample.
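
As an illustration of how far a given sample is from balance, the following minimal sketch compares the sample moments of x with the population moments for j = 1,...,J. The function name `moment_gaps` and the 0-based index convention are assumptions made here for illustration; they do not come from Royall and Herson (1973a).

```python
def moment_gaps(x, sample_idx, J):
    """Return [xbar_s^(j) - xbar^(j)] for j = 1..J.

    x          : list of auxiliary values for all N units
    sample_idx : 0-based positions of the sampled units (illustrative convention)
    A sample is balanced of order J when every gap is zero.
    """
    N = len(x)
    n = len(sample_idx)
    gaps = []
    for j in range(1, J + 1):
        pop_moment = sum(v ** j for v in x) / N
        sample_moment = sum(x[i] ** j for i in sample_idx) / n
        gaps.append(sample_moment - pop_moment)
    return gaps
```

With x = 1,...,6, the sample made of the two extreme units is balanced on the first moment but not on the second, which is the kind of departure the approximate-balance discussion above is concerned with.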


Extensions and Other Types of Balanced Samples 


Up to this point we have only discussed the balanced sample 
suggested by Royall and Herson (1973a). There are other forms of 
balanced samples whose definitions depend on the super-population model 
under consideration. Some of these are direct extensions of the already 
defined balanced sample. Holt (1975) has extended the idea of balanced 


sample for a linear multiple regression model and defined the balanced 




sample as the sample for which the first moments of each of the p auxiliary
variables in the sampled and non-sampled parts of the finite population
are equal. Using this balanced sample he obtained the BLU estimator for
the finite population total.

Scott, Brewer and Ho (1978) proposed an alternative to the balanced
sample which they called the "overbalanced sample". Their overbalanced
sample provides more efficient estimators than the balanced sample.
The principal results of this article follow from the model $\xi(0,1 : V(x))$.
Regardless of the manner in which the sample observations have been
obtained, the BLU predictor of the population total Y under this model
is

(3.2)    $T_0 = T_0[0,1 : V(x)] = \sum_s Y_i + \dfrac{\sum_s Y_i x_i / V(x_i)}{\sum_s x_i^2 / V(x_i)} \sum_{\bar{s}} x_i .$


Let s*(J) be a particular sample for which,

(3.3)    $\dfrac{\sum_s x_i^{j+1} / V(x_i)}{\sum_s x_i^2 / V(x_i)} = \dfrac{\sum_{\bar{s}} x_i^j}{\sum_{\bar{s}} x_i}, \qquad j = 0,1,\ldots,J,$

then the following Lemma and Theorem follow (Scott et al. (1978)).

Lemma 3.1. If s = s*(J), then $T_0$ is $\xi$-unbiased under the model
$\xi(\delta_0, \delta_1, \ldots, \delta_J : V^*(x))$ for any $V^*(x)$.

In fact, $T_0$ is the BLU predictor when s = s*(J) for a
wide class of models.




Theorem 3.1. Suppose s = s*(J), then $T_0$ is the BLU predictor
under the model $\xi(\delta_0, \delta_1, \ldots, \delta_J : V^*(x))$ for any variance function of
the form

         $V^*(x) = V(x) \sum_{j=0}^{J} \delta_j a_j x^{j-1} .$

Special cases:

If V(x) = x then we get the Royall and Herson (1973a)
results leading to the balanced samples, so that $T_0$ reduces to the ordinary
ratio estimator,

(3.4)    $T_1 = \dfrac{\sum_s Y_i}{\sum_s x_i} \sum_{i=1}^{N} x_i .$

On the other hand if $V(x) = x^2$, $T_0$ becomes

(3.5)    $T_2 = \sum_s Y_i + \Big( \dfrac{1}{n} \sum_s \dfrac{Y_i}{x_i} \Big) \sum_{\bar{s}} x_i$

and (3.3) becomes

(3.6)    $\dfrac{1}{n} \sum_s x_i^{j-1} = \dfrac{\sum_{\bar{s}} x_i^j}{\sum_{\bar{s}} x_i}, \qquad j = 0,1,\ldots,J .$

Obviously, this is always true for j = 1. Scott et al. (1978)
called samples satisfying condition (3.6) "overbalanced". The
mean square error of $T_2$ under $\xi(0,1 : \sigma^2 x^2)$ is

(3.7)    $\sigma^2 \Big[ \dfrac{1}{n} \Big( \sum_{\bar{s}} x_i \Big)^2 + \sum_{\bar{s}} x_i^2 \Big] .$

If the sampling fraction is small and no single $x_i$ dominates the




others, the MSE is affected very little by the choice of sample and 
little efficiency is lost by choosing an overbalanced sample. 
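
The two special-case predictors and the overbalance condition can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, all x values are assumed positive, and `x_all` is assumed to contain the sampled x values as well as the non-sampled ones.

```python
def ratio_estimator(y_s, x_s, x_all):
    """Ordinary ratio estimator: (sum_s Y / sum_s x) * sum over all N of x."""
    return sum(y_s) / sum(x_s) * sum(x_all)


def mean_of_ratios_predictor(y_s, x_s, x_all):
    """Predictor for V(x) = x^2: sum_s Y + mean_s(Y/x) * sum of non-sampled x."""
    n = len(y_s)
    x_nonsample_total = sum(x_all) - sum(x_s)
    mean_ratio = sum(y / x for y, x in zip(y_s, x_s)) / n
    return sum(y_s) + mean_ratio * x_nonsample_total


def overbalance_gap(x_s, x_rest, j):
    """Left side minus right side of the overbalance condition for one j.

    x_s    : x values of sampled units
    x_rest : x values of non-sampled units
    """
    n = len(x_s)
    lhs = sum(x ** (j - 1) for x in x_s) / n
    rhs = sum(x ** j for x in x_rest) / sum(x_rest)
    return lhs - rhs
```

When Y is exactly proportional to x both predictors reproduce the population total, and the gap for j = 1 is zero for every sample, in line with the remark that the condition is always true for j = 1.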

In many practical situations, V(x) increases more quickly
than x but less quickly than $x^2$, so that $\xi(\delta_0, \delta_1, \ldots, \delta_J : V(x))$
with $V(x) = a_1 x + a_2 x^2$ is often a fairly realistic model. Both $T_1$
with balanced sample and $T_2$ with overbalanced sample are BLU for their
respective samples under this model and it is interesting to compare
their performances. Scott et al. (1978) show that the MSE of $T_1$ with
balanced sample is

(3.8)    $M_1 = N(N-n)\,(a_1 \bar{x} + a_2 \bar{x}^{(2)})/n ,$

where $\bar{x}^{(2)} = \dfrac{1}{N} \sum_{i=1}^{N} x_i^2$, while the MSE for $T_2$
with overbalanced sample is

(3.9)    $M_2 = N(N-n)\,(a_1 \bar{x}_{\bar{s}} + a_2 \bar{x}\,\bar{x}_{\bar{s}})/n ,$

where $\bar{x}_{\bar{s}}$ is the mean of x-values not included in the overbalanced
sample.

It follows from (3.6) that if j = 0, then $\bar{x}_{\bar{s}} < \bar{x}$. So
that, $M_1 > M_2$. Thus the ratio estimator $T_1$ with balanced sample will
be less efficient than using $T_2$ with overbalanced sample. The loss
of efficiency will be small in general, if $a_1$ dominates $a_2$, but
can be substantial if $a_2$ is relatively large. These results apply
to any polynomial model $\xi(\delta_0, \delta_1, \ldots, \delta_J : V(x))$ with variance function
$V(x) = a_1 x + a_2 x^2$; hence $T_2$ with overbalanced sample (j = 0,1,...,J)
is more efficient than $T_1$ with balanced sampling of the same order.




How to get overbalanced sample: 


If the sample selection is with probability proportional to
$x_i$, then

(3.10)   $E\Big( \sum_s x_i^{j-1} / n \Big) = \sum_{i=1}^{N} x_i^j \Big/ \sum_{i=1}^{N} x_i .$


This indicates that selection with probability proportional to $x_i$
yields an approximately overbalanced sample if the sample size is large
and the sampling fraction is small. On the other hand if the sampling
fraction is large, then Scott et al. (1978) suggest selecting units with
probability equal to

(3.11)   $\dfrac{\lambda x_i}{1 + \lambda x_i} ,$

where $\lambda$ is the solution to the equation

(3.12)   $\sum_{i=1}^{N} \dfrac{\lambda x_i}{1 + \lambda x_i} = n .$

Iterative solution for (3.12) is suggested with starting value
$\lambda_0 = n / \sum_{i=1}^{N} x_i$. Using (3.11), the probability of not sampling the i-th
element is $(1 + \lambda x_i)^{-1}$, so that, for all j,

(3.13)   $E\Big( \sum_{\bar{s}} x_i^j \Big) = \sum_{i=1}^{N} \dfrac{x_i^j}{1 + \lambda x_i} = \dfrac{1}{\lambda}\, E\Big( \sum_s x_i^{j-1} \Big) .$

It follows that,

(3.14)   $E\Big( \sum_{\bar{s}} x_i^j \Big) \Big/ E\Big( \sum_{\bar{s}} x_i \Big) = E\Big( \sum_s x_i^{j-1} \Big) \Big/ n ,$




such that, we would expect to obtain an approximately overbalanced sample, 


if the sample size is large enough. 
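
The thesis only says that an iterative solution is suggested for the equation defining the multiplier in the selection probabilities. A minimal sketch follows, assuming Newton's method as the iteration (a choice made here, not taken from Scott et al. (1978)), all x values positive and n < N; the function name is hypothetical.

```python
def inclusion_probabilities(x, n, tol=1e-12, max_iter=100):
    """Solve sum_i lam*x_i/(1+lam*x_i) = n for lam by Newton's method.

    Starts from lam_0 = n / sum(x), the starting value named in the text.
    Returns (lam, list of selection probabilities lam*x_i/(1+lam*x_i)).
    """
    lam = n / sum(x)
    for _ in range(max_iter):
        f = sum(lam * v / (1 + lam * v) for v in x) - n
        if abs(f) < tol:
            break
        fprime = sum(v / (1 + lam * v) ** 2 for v in x)
        lam -= f / fprime
    probs = [lam * v / (1 + lam * v) for v in x]
    return lam, probs
```

The left side of the equation is increasing and concave in the multiplier and is below n at the starting value, so the Newton iterates increase towards the root without overshooting it. The resulting probabilities sum to n, and the expectation identity between non-sampled and sampled sums can be checked directly from them.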


Royall and Herson (1973b) and Scott et al. (1978) have extended
their definitions of balanced sampling and overbalanced sampling schemes
respectively to stratified populations. Royall (1976b) has given some
different ideas of balanced sample for two-stage sampling. Let the finite
population consist of N elements in K clusters, with $M_i$ elements in
the i-th cluster, such that $\sum_{i=1}^{K} M_i = N$. Suppose we have first chosen
a sample s of k clusters and then, from the i-th sampled cluster,
a random sample $s_i$ consisting of $m_i$ elements has been selected out of
the $M_i$ elements.


Let, 


Ay: Sex}: Mm, 
s 
mo. J Ww /K 
i 
i=1 
“GP ie pu [in : 
s ree 


then Royall (1976b) says that the above two-stage sample is balanced if,


(3.15) M = M Bie cue ea ee) 


This type of balanced sample gives unbiased ratio estimators in two-stage
sampling under a quadratic regression model, and the ratio type estimator
is best. This result also holds for higher-order polynomial models
when the sample is balanced on the corresponding higher-order moments.




In the literature on survey sampling we also find a completely
different definition of balanced sample given by Singh and Garg (1979).
This is actually a kind of systematic sample with random start. The
suggested balanced sample is: assuming population size N and sample
size n both even (for odd values of N and n a modification of the
definition is also available), first draw n/2 units at random from the
first N/2 units of the population; the remaining n/2 elements are
taken from the (N/2+1)-th to N-th units of the population with indices
$N+1-r_i$, i = 1,2,...,n/2, where $r_i$ is the index of the i-th unit
drawn from the first N/2 units.
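
The selection rule just described can be sketched as follows for the even case only. The function name and the use of Python's `random.Random` as the source of randomness are assumptions for illustration; the modification for odd N or n mentioned above is not covered.

```python
import random


def balanced_systematic_sample(N, n, rng=None):
    """Draw n/2 unit labels from 1..N/2 and add their mirror images N+1-r.

    Unit labels are 1-based. Only even N and even n are handled in this sketch.
    """
    if N % 2 or n % 2:
        raise ValueError("this sketch covers only even N and even n")
    rng = rng or random.Random()
    first_half = rng.sample(range(1, N // 2 + 1), n // 2)
    mirrored = [N + 1 - r for r in first_half]
    return sorted(first_half + mirrored)
```

Because every selected label r is paired with N+1-r, the labels in the sample always sum to (n/2)(N+1), which is why the plan removes the effect of a linear trend in the population ordering.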

This sampling plan has the advantages of both simple random
sampling and systematic sampling and works best for populations exhibiting
linear trend or periodicity. Their empirical study shows that this
balanced sampling is generally better than simple random sampling and
in most of the cases even better than systematic sampling and stratified
sampling.

It is clear from the above discussion of randomization and
purposive sampling that there are some cases where the randomization
principle does not carry much meaning but purposive selection like
optimum sampling or balanced sampling etc. gives meaningful and higher
precision estimators. It may sometimes be feasible to draw these types
of purposive samples, but in large scale surveys with many items the
purposive design could lead to very inefficient estimators for some
of the items. Rao (1975) says, "Of course, this criticism also
applies to conventional designs such as the probability proportional
to size sampling plans or stratification by size with a 100% sampling




rate in the stratum containing the units with largest $x_i$. In such
a situation, it might be advisable to employ equal probability
sampling and utilize any quantitative concomitant information only at
the estimation stage." The role of randomization in survey sampling
cannot be taken as the only basis of data analysis and inference as
the conventional survey samplers used to think.




CHAPTER IV 


RANKS AND ORDER STATISTICS FOR FINITE POPULATION 


§4.1 ORDER STATISTICS IN SAMPLING FROM FINITE POPULATION


There are few works in the survey sampling literature on the use
of ranks and order statistics in estimating finite population parameters.
It seems that Wilks (1962, p. 243) is the first to discuss the distribution
of order statistics in samples from a finite population. There he
considered a finite population $\pi$ consisting of N distinct elements,
say $Y_{01}, Y_{02}, \ldots, Y_{0N}$, and derived the probability function of the
sample k-th order statistic. Let s be a random sample of size n
from this population and let us denote the ordered sample by
$Y_{(1)} < Y_{(2)} < \cdots < Y_{(n)}$. So it follows that the probability of the k-th
order statistic of the sample being equal to the t-th order statistic
$Y_{0(t)}$ of the population is

(4.1)    $P[Y_{(k)} = Y_{0(t)}] = p_{N,n,k}(t) = \binom{t-1}{k-1} \binom{N-t}{n-k} \Big/ \binom{N}{n} ,$

where t = k, k+1, ..., N-n+k.

We can consider (4.1) either as (i) the probability function of
the random variable $Y_{(k)}$, its mass points being $Y_{0(t)}$,
t = k, k+1, ..., N-n+k, or (ii) the probability function of the random
variable t, that is, the rank of the y-value in the population to
which the k-th order statistic in the sample is equal.
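
The probability function (4.1) is easy to evaluate numerically. A minimal sketch, assuming a helper named `rank_pmf` (a name introduced here for illustration) and Python's standard `math.comb`:

```python
from math import comb


def rank_pmf(N, n, k, t):
    """P[k-th sample order statistic equals the t-th population order statistic]
    for a simple random sample of size n from N distinct values."""
    if t < k or t > N - n + k:
        return 0.0
    return comb(t - 1, k - 1) * comb(N - t, n - k) / comb(N, n)
```

Summing `rank_pmf(N, n, k, t)` over t = k,...,N-n+k gives 1, which is the property used repeatedly in the median-unbiasedness arguments later in the chapter.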




Wilks (1962) has given the following results on moments of
t. Moments of t are easier to derive from the following relation

(4.2)    $E[(t+r-1)^{[r]}] = \dfrac{(k+r-1)^{[r]}\,(N+r)^{[r]}}{(n+r)^{[r]}} ,$

where $x^{[r]} = x(x-1)\cdots(x-r+1)$, and r is fixed. Putting r = 1
and 2 in (4.2), we can get after some simplification,

(4.3)    $E(t) = \dfrac{k(N+1)}{n+1} ,$

(4.4)    $V(t) = \dfrac{k(N+1)(N-n)(n-k+1)}{(n+1)^2 (n+2)} .$

On the other hand, considering (4.1) as the probability function of $Y_{(k)}$,
we get

(4.5)    $E(Y_{(k)}) = \sum_{t=k}^{N-n+k} Y_{0(t)} \binom{t-1}{k-1} \binom{N-t}{n-k} \Big/ \binom{N}{n} .$

Obviously this has no simple form and so is the variance of $Y_{(k)}$.
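
The closed forms for the mean and variance of the rank t can be checked against a direct summation over the probability function. The sketch below uses exact rational arithmetic; the function names are hypothetical and introduced only for this check.

```python
from fractions import Fraction
from math import comb


def rank_moments(N, n, k):
    """Mean and variance of t computed directly from the probability function."""
    total = comb(N, n)
    pmf = {t: Fraction(comb(t - 1, k - 1) * comb(N - t, n - k), total)
           for t in range(k, N - n + k + 1)}
    mean = sum(t * p for t, p in pmf.items())
    var = sum((t - mean) ** 2 * p for t, p in pmf.items())
    return mean, var


def closed_form_moments(N, n, k):
    """Closed-form mean k(N+1)/(n+1) and variance k(N+1)(N-n)(n-k+1)/((n+1)^2 (n+2))."""
    mean = Fraction(k * (N + 1), n + 1)
    var = Fraction(k * (N + 1) * (N - n) * (n - k + 1), (n + 1) ** 2 * (n + 2))
    return mean, var
```

For N = 25, n = 9 and k = 5 both routes give E(t) = 13, the rank of the population median.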
We have mentioned earlier that (4.1) can be considered to be
the probability that the k-th order statistic of the sample will be the
t-th order statistic of the population. So we may want to know the
most likely value of t for given k. This is given by the value of
t satisfying the following relation,

(4.6)    $p_{N,n,k}(t-1) \le p_{N,n,k}(t) \ge p_{N,n,k}(t+1) .$




This implies after some algebraic manipulations,

(4.7)    $\dfrac{N(k-1)}{n-1} \le t \le \dfrac{N(k-1)}{n-1} + 1$

or

(4.8)    $\dfrac{k-1}{n-1} \le \dfrac{t}{N} \le \dfrac{k-1}{n-1} + \dfrac{1}{N} .$

As a particular case of (4.7), we can find that the sample median is also
the ML-estimator for the population median of a finite population. This
can be illustrated by the following example.

Example. Let our population size be N = 25 and sample size n = 9,
so that the 5th largest sample value is the sample median, hence
k = 5. So using k = 5 in relation (4.7), we have

         $N/2 \le t \le N/2 + 1$

or

         $12.5 \le t \le 13.5 ,$

which shows that the most likely integer value of t is 13, but
$Y_{0(t)} = Y_{0(13)}$ is the median of our population. Hence the sample median
$Y_{(5)}$ is the ML-estimator of the population median $Y_{0(13)}$.
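
The example can be confirmed by locating the mode of the rank distribution by brute force. A minimal sketch, with a hypothetical function name; when two adjacent ranks tie, Python's `max` returns the smaller one.

```python
from math import comb


def most_likely_rank(N, n, k):
    """Population rank t that maximises P[Y_(k) = Y_0(t)].

    The common divisor C(N, n) is omitted because it does not affect the argmax.
    """
    candidates = range(k, N - n + k + 1)
    return max(candidates, key=lambda t: comb(t - 1, k - 1) * comb(N - t, n - k))
```

For N = 25, n = 9, k = 5 the unnormalised weights at t = 12, 13, 14 are 235950, 245025 and 235950, so the mode is t = 13 as stated in the example.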




Median unbiasedness: We have already mentioned that the
sample median is the ML-estimator for the finite population median.
Now, we are going to show that the sample median is median unbiased when
sampling is done from a finite population.

Definition. If $\lambda$ is the median of a distribution and $\hat{\lambda}$ is an
estimator of $\lambda$, then we say $\hat{\lambda}$ is median unbiased if

(4.9)    $P[\hat{\lambda} < \lambda] = P[\hat{\lambda} > \lambda] .$

For a continuous population $P[\hat{\lambda} = \lambda] = 0$ and hence the above
probabilities are equal to 1/2. But in general, for a finite population,
$P[\hat{\lambda} = \lambda] \ne 0$, so for median unbiasedness in a finite population we shall
consider the relation (4.9). To show median unbiasedness of the sample
median, we have to consider the following four cases.

(1) N and n are both odd
(2) N is odd and n is even
(3) N is even and n is odd
(4) N and n are both even.

Case 1. N and n are both odd.

Let N = 2M+1 and n = 2m+1, M and m are integers.
Therefore $Y_{0(M+1)}$ and $Y_{(m+1)}$ are respectively the population and
sample medians. Since (4.1) is a probability function, we have
$\sum_{t=k}^{N-n+k} p_{N,n,k}(t) = 1$. So for k = m+1, we have N-n+k = N-m and



(4.10)   $\sum_{t=m+1}^{N-m} P[Y_{(m+1)} = Y_{0(t)}] = \sum_{t=m+1}^{N-m} \binom{t-1}{m} \binom{N-t}{m} \Big/ \binom{N}{n} = 1 .$

Since the summand is unchanged when t is replaced by N+1-t,
it is clear from (4.10) that

         $\sum_{t=m+1}^{M} P[Y_{(m+1)} = Y_{0(t)}] = \sum_{t=M+2}^{N-m} P[Y_{(m+1)} = Y_{0(t)}]$

or

         $P[Y_{(m+1)} \le Y_{0(M+1)}] = P[Y_{(m+1)} \ge Y_{0(M+1)}]$

or

         $P[Y_{(m+1)} < Y_{0(M+1)}] = P[Y_{(m+1)} > Y_{0(M+1)}] .$

Hence the sample median $Y_{(m+1)}$ is median unbiased. But the situation
is a little complicated in other cases, where there is no unique
median.


Case 2. N is odd and n is even.

As before, let N = 2M+1 and n = 2m, M and m are integers,
so that, conventionally, the sample median is an average of $Y_{(m)}$ and $Y_{(m+1)}$.
For k = m, we have N-n+k = N-m, n-k = m, so that




(4.11)   $\sum_{t=m}^{N-m} P[Y_{(m)} = Y_{0(t)}] = \sum_{t=m}^{N-m} \binom{t-1}{m-1} \binom{N-t}{m} \Big/ \binom{N}{n} = 1 ,$

and also for k = m+1, N-n+k = N-m+1, n-k = m-1;

(4.12)   $\sum_{t=m+1}^{N-m+1} P[Y_{(m+1)} = Y_{0(t)}] = \sum_{t=m+1}^{N-m+1} \binom{t-1}{m} \binom{N-t}{m-1} \Big/ \binom{N}{n} = 1 .$

Expanding (4.11) and (4.12) as in Case 1, we find that

         $P[Y_{(m)} > Y_{0(M+1)}] = P[Y_{(m+1)} < Y_{0(M+1)}] ,$

         $P[Y_{(m)} < Y_{0(M+1)}] = P[Y_{(m+1)} > Y_{0(M+1)}] ,$

(4.13)   $P[Y_{(m)} = Y_{0(M+1)}] = P[Y_{(m+1)} = Y_{0(M+1)}] ,$

and

(4.14)   $P[Y_{(m)} \ge Y_{0(M+1)}] = P[Y_{(m+1)} \le Y_{0(M+1)}] .$

If we denote $\tilde{Y} = (Y_{(m)} + Y_{(m+1)})/2$ as the median of the sample
then it follows from (4.14) that $P[\tilde{Y} < Y_{0(M+1)}] = P[\tilde{Y} > Y_{0(M+1)}]$
and hence median unbiasedness of $\tilde{Y}$.

Example. Let N = 15, n = 6, so that M = 7 and m = 3. The
sample median is $\tilde{Y} = (Y_{(3)} + Y_{(4)})/2$, the population median is $Y_0 = Y_{0(8)}$,

         $\binom{N}{n} = \binom{15}{6} = 5005 ,$

and using (4.11),

         $P[Y_{(3)} < Y_{0(8)}] = \dfrac{3115}{5005} ,$




         $P[Y_{(3)} = Y_{0(8)}] = \dfrac{735}{5005} ,$

         $P[Y_{(3)} > Y_{0(8)}] = \dfrac{1155}{5005} .$

Again using (4.12),

         $P[Y_{(4)} < Y_{0(8)}] = \dfrac{1155}{5005} ,$

         $P[Y_{(4)} = Y_{0(8)}] = \dfrac{735}{5005} ,$

         $P[Y_{(4)} > Y_{0(8)}] = \dfrac{3115}{5005} .$

Therefore, $P[Y_{(3)} \le Y_{0(8)}] = P[Y_{(4)} \ge Y_{0(8)}] = \dfrac{3850}{5005}$. And hence
$P[\tilde{Y} < Y_{0(8)}] = P[\tilde{Y} > Y_{0(8)}]$.
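
The counts in this example can be reproduced by summing the numerators of the rank probabilities below, at and above the population median rank. The helper name `split_counts` is hypothetical and only serves this check.

```python
from math import comb


def split_counts(N, n, k, t0):
    """Numerators (over C(N, n)) of P[Y_(k) < Y_0(t0)], P[=] and P[>]."""
    def weight(t):
        return comb(t - 1, k - 1) * comb(N - t, n - k)

    ranks = range(k, N - n + k + 1)
    below = sum(weight(t) for t in ranks if t < t0)
    equal = weight(t0) if k <= t0 <= N - n + k else 0
    above = sum(weight(t) for t in ranks if t > t0)
    return below, equal, above
```

For N = 15, n = 6 the calls `split_counts(15, 6, 3, 8)` and `split_counts(15, 6, 4, 8)` return mirror-image triples over the common denominator 5005, which is the symmetry the case analysis relies on.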
Case 3. N is even and n is odd.

Let N = 2M and n = 2m+1, M and m are integers. Let
the population median be $Y_0 = (Y_{0(M)} + Y_{0(M+1)})/2$ and the sample median
is $Y_{(m+1)}$. For k = m+1, N-n+k = N-m. Using these values and
proceeding as in Case 1, we find that

(4.15)   $P[Y_{(m+1)} \le Y_{0(M)}] = P[Y_{(m+1)} \ge Y_{0(M+1)}]$

and hence

(4.16)   $P[Y_{(m+1)} < Y_0] = P[Y_{(m+1)} > Y_0]$

establishes that $Y_{(m+1)}$ is median unbiased.




Case 4. N and n are both even.

Let N = 2M and n = 2m, M and m are integers. Sample
and population medians are respectively $\tilde{Y} = (Y_{(m)} + Y_{(m+1)})/2$ and
$Y_0 = (Y_{0(M)} + Y_{0(M+1)})/2$. Proceeding as in Case 2, we find that

(4.17)   $P[Y_{(m)} = Y_{0(M)}] = P[Y_{(m+1)} = Y_{0(M+1)}] ,$

         $P[Y_{(m)} < Y_{0(M)}] = P[Y_{(m+1)} > Y_{0(M+1)}] ,$

(4.18)   $P[Y_{(m+1)} \le Y_{0(M)}] = P[Y_{(m)} \ge Y_{0(M+1)}] ,$

and

(4.19)   $P[Y_{(m)} \le Y_{0(M)}] = P[Y_{(m+1)} \ge Y_{0(M+1)}] .$

Now,

(4.20)   $P[\tilde{Y} < Y_0] = P\Big[ \dfrac{Y_{(m)} + Y_{(m+1)}}{2} < \dfrac{Y_{0(M)} + Y_{0(M+1)}}{2} \Big] = P[Y_{(m+1)} \le Y_{0(M)}] .$

Similarly,

(4.21)   $P[\tilde{Y} > Y_0] = P[Y_{(m)} \ge Y_{0(M+1)}] .$

Therefore, using (4.18), (4.20) and (4.21), we get

         $P[\tilde{Y} < Y_0] = P[\tilde{Y} > Y_0] ,$

which establishes median unbiasedness of $\tilde{Y}$.




§4.2 CONFIDENCE INTERVALS FOR QUANTILE IN FINITE POPULATION


Let t be a fixed integer in the range $1 \le t \le N$. Then we
can consider $Y_{0(t)}$ as the (t/N)-th quantile of the population $\pi$.
If $G_N(y) = (\text{number of } Y_{0(i)} \le y,\ 1 \le i \le N)/N$ then we formally define the $\alpha$-th
quantile of the finite population as $\sup\{y : G_N(y) \le \alpha\}$, $0 < \alpha < 1$.
Similarly, the sample quantile is also defined.

A confidence interval for $Y_{0(t)}$ in a finite population is
available in Wilks (1962, p. 333). Years later Meyer (1972) and
Sedransk and Meyer (1978) extensively studied and extended results
on this confidence interval. For fixed t = t',


(4.22)   $P[Y_{(k)} \le Y_{0(t')}] = \sum_{t=k}^{t'} p_{N,n,k}(t) .$

So for fixed N, n, t' and $\gamma > 0$, there is a largest k, say k',
such that

(4.23)   $\sum_{t=k'}^{t'} p_{N,n,k'}(t) \ge \gamma .$

We shall consider $Y_{(k')}$ as the best lower $100\gamma\%$ confidence limit
for $Y_{0(t')}$. Except for values of N, n, t' and $1-\gamma$ which are
uninterestingly small, such lower confidence limits can be shown to
exist. Similarly, the best upper $100\gamma\%$ confidence limit for $Y_{0(t')}$
is obtained by choosing the smallest k, say k", such that

(4.24)   $\sum_{t=t'}^{N-n+k''} p_{N,n,k''}(t) \ge \gamma .$
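
The search for the best lower confidence limit can be sketched as a direct scan over k. The function names below are hypothetical; the coverage probability is the sum of the rank probabilities from t = k up to t', and the scan keeps the largest k whose coverage is at least the requested level.

```python
from math import comb


def coverage_lower(N, n, k, t_prime):
    """P[Y_(k) <= Y_0(t')]: sum of rank probabilities for t = k..t'."""
    upper = min(t_prime, N - n + k)
    numerator = sum(comb(t - 1, k - 1) * comb(N - t, n - k)
                    for t in range(k, upper + 1))
    return numerator / comb(N, n)


def best_lower_index(N, n, t_prime, gamma):
    """Largest k with coverage_lower >= gamma, or None if no k qualifies."""
    best = None
    for k in range(1, n + 1):
        if coverage_lower(N, n, k, t_prime) >= gamma:
            best = k
    return best
```

Because the coverage decreases as k increases, the last qualifying k found by the scan is the largest one; the upper limit can be handled symmetrically by scanning for the smallest k whose upper-tail sum reaches the same level.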


For the best $100\gamma\%$ confidence interval for $Y_{0(t')}$, that is the
simultaneous upper and lower confidence limits, the probability




function involved here is cumbersome. Meyer (1972) has given the 


following expression for simultaneous confidence interval 


EY ey % (ry | 


for Yor)? where 


(4.25) PLY 4) areas at 


ar t-i-1\ ,N-tti ter-l 7 t=i-2\ ,N-ttitl 

" i=o (ea aK tet) : bo Cu x n=r ) 
Pe Nin) DEC ANARAEGN aie Lace. 

& 

A simpler form of this expression is available in Sedransk and Meyer
(1978). This paper also states that in forming a confidence interval for
the (t/N)-th quantile, the confidence coefficient for a population with ties
is larger than the confidence coefficient for a population without ties;
the proof is available in Meyer (1972). In fact the confidence coefficient
for the (t/N)-th quantile for a population without ties is the lower
bound for the confidence coefficient for the comparable confidence
interval for any finite population.


Confidence intervals in case of stratified sampling

Let us now consider a stratified population of q strata
having strata sizes $N_i$, $\sum_{i=1}^{q} N_i = N$, the population size.
Let the population values in ascending order be:

(4.26)   $Y_{01(1)} < \cdots < Y_{01(N_1)} < Y_{02(1)} < \cdots < Y_{02(N_2)} < \cdots < Y_{0q(1)} < \cdots < Y_{0q(N_q)} .$

We have drawn a stratified random sample of size n from this
population with $n = \sum_{i=1}^{q} n_i$, where $n_i$ is the number of units selected
at random from the i-th stratum. Let the sample values be:




$Y_{1(1)} < \cdots < Y_{1(n_1)} < Y_{2(1)} < \cdots < Y_{2(n_2)} < \cdots < Y_{q(1)} < \cdots < Y_{q(n_q)}$.


Let our interest be in a confidence interval for the $(t/N)$th quantile of
the population. It is sufficient to look at the stratum which contains
the $t$th order $y$-value of the population. Let the $t$th order value be in
the $m$th stratum. Since a sample from each stratum is drawn independently,
we shall consider, for the $(t/N)$th quantile, the sample drawn from
the $m$th stratum only. Let $t' = t - (N_1 + \cdots + N_{m-1})$. Proceeding in a
similar way as in section one of this chapter, we can find the probability
that the $k$th order statistic of the $m$th stratum will be equal to the
$t$th order statistic of the population, i.e.,

(4.27)  $P[Y_{m(k)} = y_{0(t)}] = P[Y_{m(k)} = y_{0m(t')}] = \binom{t'-1}{k-1}\binom{N_m-t'}{n_m-k} \Big/ \binom{N_m}{n_m}$,

$k = 1, \ldots, n_m$; $t' = k, k+1, \ldots, N_m - n_m + k$.

If the $t$th order statistic of the population is not in the $m$th
stratum, but in the $\ell$th stratum, then

(4.28)  $P[Y_{m(k)} = y_{0(t)}] = P[Y_{m(k)} = y_{0\ell(t')}] = 0$ for $m \neq \ell$ and $t' = t - (N_1 + \cdots + N_{\ell-1})$.


£-1 
It appears from (4.27) and (4.28) that results for an unstratified
population can easily be used for a stratified population with a few
changes in notation. Sedransk and Meyer (1978) have given results
for a more general case. There they have not imposed
the restriction (4.26) on the population, and have established results
for a population with two strata.
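
As a numerical illustration of (4.27), the following Python sketch locates the stratum that holds the population rank $t$ (assuming the strata are ordered as in restriction (4.26)) and evaluates the probability that the $k$th order statistic of that stratum's sample equals the $t$th population value. The function names `local_rank` and `stratum_order_stat_pmf` are illustrative only and do not appear in the thesis.

```python
from math import comb


def local_rank(t, strata_sizes):
    """Return (m, t'): the 1-based stratum index m that contains population
    rank t, and the rank t' of that value inside stratum m.
    Assumes the strata are ordered as in restriction (4.26)."""
    cumulative = 0
    for m, size in enumerate(strata_sizes, start=1):
        if t <= cumulative + size:
            return m, t - cumulative
        cumulative += size
    raise ValueError("t exceeds the population size")


def stratum_order_stat_pmf(k, t_prime, N_m, n_m):
    """P[Y_m(k) = y_0m(t')] for a simple random sample of size n_m drawn
    without replacement from a stratum of size N_m, as in (4.27)."""
    if not (1 <= k <= n_m <= N_m) or not (1 <= t_prime <= N_m):
        return 0.0
    # Outside t' = k, ..., N_m - n_m + k the event is impossible.
    if t_prime < k or N_m - t_prime < n_m - k:
        return 0.0
    return comb(t_prime - 1, k - 1) * comb(N_m - t_prime, n_m - k) / comb(N_m, n_m)


# Example: three strata of sizes 4, 6 and 5; population rank t = 7.
m, t_prime = local_rank(7, [4, 6, 5])
prob = stratum_order_stat_pmf(k=2, t_prime=t_prime, N_m=6, n_m=3)
```

Summing `stratum_order_stat_pmf` over all admissible $t'$ for fixed $k$ gives one, which is a quick check that the expression is a proper probability function.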




§4.3 JOINT DISTRIBUTION OF QUANTILES OF A SAMPLE FROM BIVARIATE 
FINITE POPULATION 


Let our bivariate finite population be $(x_{01}, y_{01}), (x_{02}, y_{02}),
\ldots, (x_{0N}, y_{0N})$ of size $N$. For simplicity let us assume that there
are no ties among the $x$'s or among the $y$'s. Let the ordered $x$-values
be $x_{0(1)} < x_{0(2)} < \cdots < x_{0(N)}$ and the ordered $y$-values be
$y_{0(1)} < y_{0(2)} < \cdots < y_{0(N)}$; here $y_{0(i)}$ does not necessarily
belong to the same pair as $x_{0(i)}$. We have drawn a simple random sample of
size $n$ from this population. Let the sample be $(X_k, Y_k)$,
$k = 1, 2, \ldots, n$. As above let us denote the sample ordered $X$-values and
$Y$-values as $X_{(1)} < \cdots < X_{(n)}$ and $Y_{(1)} < \cdots < Y_{(n)}$
respectively.

Our objective is to find the bivariate distribution of the sample $(i/n)$th
quantile of $x$ and $(j/n)$th quantile of $y$. Let us assume $X_{(i)}$ and
$Y_{(j)}$ to be the corresponding sample quantiles. Siddiqui (1960) has derived
the joint distribution of $(X_{(i)}, Y_{(j)})$ when the sample was drawn from
a continuous bivariate distribution.

The distribution of $(X_{(i)}, Y_{(j)})$ for a sample from a finite
population depends upon the nature of the pairs of values $(x_{0k}, y_{0k})$ in
the population. Analogous to Siddiqui (1960), we shall introduce two
new variables $M$ and $m_0$ where,

$M$ = number of pairs $(X_k, Y_k)$ in the sample with $X_k < X_{(i)}$
and $Y_k < Y_{(j)}$. $M$ is a random variable which may vary
from sample to sample.

$m_0$ = number of pairs $(x_{0k}, y_{0k})$ in the population with
$x_{0k} < x_{0(s)}$ and $y_{0k} < y_{0(t)}$. $m_0$ is non-random if one considers the




finite population as fixed, and $m_0$ varies for different
values of $s$ and $t$.

Here, $M$ is a dummy variable which is to be summed out from the joint
distribution of $(M, X_{(i)}, Y_{(j)})$ to get our desired joint distribution
of $(X_{(i)}, Y_{(j)})$.

Let $(x_0, y_{0(t)})$ and $(x_{0(s)}, y_0)$ be two units of the
population which lie on the lines $y = y_{0(t)}$ and $x = x_{0(s)}$
respectively. So any one of the following five cases may occur.

Case 1. $x_0 < x_{0(s)}$, $y_0 < y_{0(t)}$
Case 2. $x_0 < x_{0(s)}$, $y_0 > y_{0(t)}$
Case 3. $x_0 > x_{0(s)}$, $y_0 < y_{0(t)}$
Case 4. $x_0 > x_{0(s)}$, $y_0 > y_{0(t)}$
Case 5. $x_0 = x_{0(s)}$, in which case there is only one point
        $(x_0, y_0)$ common to both lines $x = x_{0(s)}$ and
        $y = y_{0(t)}$. In such a case, the pair $(x_0, y_0) =
        (x_{0(s)}, y_{0(t)})$ is the measurement on a single unit in the
        population.

Let us now find $P[M = m,\ X_{(i)} = x_{0(s)},\ Y_{(j)} = y_{0(t)}]$ under
the above mentioned different cases.


Case 1. If our population satisfies Case 1, then a possible
distribution of population and sample values is given in Figure 1.
For simplicity of figures, we shall assume that x-values are all
positive.




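[FIGURE 1. Configuration of population and sample values under Case 1. The lines $x = x_{0(s)}$ and $y = y_{0(t)}$ divide the plane into four regions; the upper-right region contains $N+m_0-s-t+2$ population pairs and $n+M-i-j+2$ sample pairs.]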


In the above figure, $N+m_0-s-t+2$ (or $n+M-i-j+2$) represents the
number of pairs $(x_{0k}, y_{0k})$ (or $(X_k, Y_k)$) in the population (or sample)
satisfying $x_{0k} > x_{0(s)}$ and $y_{0k} > y_{0(t)}$ (or $X_k > X_{(i)}$ and
$Y_k > Y_{(j)}$). Similar meanings apply for the other numbers of the figure.
Points marked by ⊙ correspond to the units of the population which lead
us to consider Case 1.


Since our sample is a simple random sample of size $n$
drawn without replacement, therefore,

(4.29)  $P_1[M = m,\ X_{(i)} = x_{0(s)},\ Y_{(j)} = y_{0(t)}] = \binom{m_0}{m}\binom{s-m_0-2}{i-m-2}\binom{t-m_0-2}{j-m-2}\binom{N+m_0-s-t+2}{n+m-i-j+2} \Big/ \binom{N}{n}$

where,

$m = 0, 1, \ldots, m' = \min(m_0, i-2, j-2)$
$s = i, i+1, \ldots, N-n+i$
$t = j, j+1, \ldots, N-n+j$

So (4.29) can be considered as the joint probability function of $M$,
$X_{(i)}$ and $Y_{(j)}$ with mass points at $M = 0, 1, \ldots, m'$,
$X_{(i)} = x_{0(s)}$ $(s = i, i+1, \ldots, N-n+i)$ and
$Y_{(j)} = y_{0(t)}$ $(t = j, j+1, \ldots, N-n+j)$. Here
and for the rest of the thesis, we shall assume that for any integers
$p$ and $q$,

$\binom{p}{q} = 0$ if $p < 0$ or $q < 0$ or $p < q$.


Case 2. Proceeding as in Case 1, the configuration of the sample as 
well as the population values that satisfy Case 2 is given in Figure 2. 


Hence the required probability is

(4.30)  $P_2[M = m,\ X_{(i)} = x_{0(s)},\ Y_{(j)} = y_{0(t)}] = \binom{m_0}{m}\binom{s-m_0-2}{i-m-2}\binom{t-m_0-1}{j-m-1}\binom{N+m_0-s-t+1}{n+m-i-j+1} \Big/ \binom{N}{n}$

where
$m = 0, 1, \ldots, m' = \min(m_0, i-2, j-1)$
$s = i, i+1, \ldots, N-n+i$
$t = j, j+1, \ldots, N-n+j$

[FIGURE 2. Configuration of population and sample values under Case 2; the upper-right region contains $N+m_0-s-t+1$ population pairs and $n+M-i-j+1$ sample pairs.]

Case 3. The appropriate configuration for the sample as well as for 


the population is given in Figure 3, below. 


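[FIGURE 3. Configuration of population and sample values under Case 3, showing the units $(x_0, y_{0(t)})$ and $(x_{0(s)}, y_0)$; the upper-right region contains $N+m_0-s-t+1$ population pairs and $n+M-i-j+1$ sample pairs.]

The required probability for such a configuration is

(4.31)  $P_3[M = m,\ X_{(i)} = x_{0(s)},\ Y_{(j)} = y_{0(t)}] = \binom{m_0}{m}\binom{s-m_0-1}{i-m-1}\binom{t-m_0-2}{j-m-2}\binom{N+m_0-s-t+1}{n+m-i-j+1} \Big/ \binom{N}{n}$

where,
$m = 0, 1, \ldots, m' = \min(m_0, i-1, j-2)$
$s = i, i+1, \ldots, N-n+i$
$t = j, j+1, \ldots, N-n+j$
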
Case 4. The configuration for the sample and population corresponding 


to Case 4, is given in Figure 4, below. 


[FIGURE 4. Configuration of population and sample values under Case 4.]

The probability for such a configuration is

(4.32)  $P_4[M = m,\ X_{(i)} = x_{0(s)},\ Y_{(j)} = y_{0(t)}] = \binom{m_0}{m}\binom{s-m_0-1}{i-m-1}\binom{t-m_0-1}{j-m-1}\binom{N+m_0-s-t}{n+m-i-j} \Big/ \binom{N}{n}$

where
$m = 0, 1, \ldots, m' = \min(m_0, i-1, j-1)$
$s = i, i+1, \ldots, N-n+i$
$t = j, j+1, \ldots, N-n+j$


Case 5. A suitable configuration for Case 5 is given in Figure 5, 


below. 


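[FIGURE 5. Configuration of population and sample values under Case 5, showing the single unit $(x_{0(s)}, y_{0(t)})$; the upper-right region contains $N+m_0-s-t+1$ population pairs and $n+M-i-j+1$ sample pairs.]

The required probability is

(4.33)  $P_5[M = m,\ X_{(i)} = x_{0(s)},\ Y_{(j)} = y_{0(t)}] = \binom{m_0}{m}\binom{s-m_0-1}{i-m-1}\binom{t-m_0-1}{j-m-1}\binom{N+m_0-s-t+1}{n+m-i-j+1} \Big/ \binom{N}{n}$

where
$m = 0, 1, \ldots, m' = \min(m_0, i-1, j-1)$
$s = i, i+1, \ldots, N-n+i$
$t = j, j+1, \ldots, N-n+j$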


Therefore, finally, the required probability function for
$(X_{(i)}, Y_{(j)})$ is

(4.34)  $P[X_{(i)} = x_{0(s)},\ Y_{(j)} = y_{0(t)}] = \sum_{m=0}^{m'} P_k[M = m,\ X_{(i)} = x_{0(s)},\ Y_{(j)} = y_{0(t)}]$,

$k = 1, 2, \ldots, 5$ depending on which case we have at hand and
$s = i, i+1, \ldots, N-n+i$; $t = j, j+1, \ldots, N-n+j$.
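
The five case-specific expressions (4.29) to (4.33) and the sum (4.34) can be checked numerically on a small population. The sketch below is an illustration under our own naming (`binom`, `joint_pmf`, `brute_force` and the toy population `pop` are not from the thesis). It encodes the five cases through two indicators: whether the unit on the line $y = y_{0(t)}$ lies to the left of $x = x_{0(s)}$, and whether the unit on the line $x = x_{0(s)}$ lies below $y = y_{0(t)}$. The result is then compared with direct enumeration of all simple random samples.

```python
from itertools import combinations
from math import comb


def binom(p, q):
    """Binomial coefficient, taken as 0 when p < 0, q < 0 or p < q."""
    return comb(p, q) if 0 <= q <= p else 0


def joint_pmf(pop, n, i, j, s, t):
    """P[X_(i) = x_0(s), Y_(j) = y_0(t)] under simple random sampling without
    replacement, summing the case-specific terms over m as in (4.34).
    `pop` is a list of (x, y) pairs with no ties among x's or among y's."""
    N = len(pop)
    x_s = sorted(x for x, _ in pop)[s - 1]
    y_t = sorted(y for _, y in pop)[t - 1]
    y0 = next(y for x, y in pop if x == x_s)  # unit on the line x = x_0(s)
    x0 = next(x for x, y in pop if y == y_t)  # unit on the line y = y_0(t)
    m0 = sum(1 for x, y in pop if x < x_s and y < y_t)

    left = 1 if x0 < x_s else 0      # Cases 1 and 2
    below = 1 if y0 < y_t else 0     # Cases 1 and 3
    special = 1 if x0 == x_s else 2  # Case 5: one unit lies on both lines

    # Population counts in the upper-left, lower-right and upper-right regions.
    UL = s - 1 - m0 - left
    LR = t - 1 - m0 - below
    UR = N - special - m0 - UL - LR

    total = 0
    for m in range(m0 + 1):
        ul = i - 1 - m - left
        lr = j - 1 - m - below
        ur = n - special - m - ul - lr
        total += binom(m0, m) * binom(UL, ul) * binom(LR, lr) * binom(UR, ur)
    return total / comb(N, n)


def brute_force(pop, n, i, j):
    """Exact distribution of the population ranks (s, t) of (X_(i), Y_(j)),
    obtained by enumerating every sample of size n."""
    xs = sorted(x for x, _ in pop)
    ys = sorted(y for _, y in pop)
    counts = {}
    for sample in combinations(pop, n):
        x_i = sorted(x for x, _ in sample)[i - 1]
        y_j = sorted(y for _, y in sample)[j - 1]
        key = (xs.index(x_i) + 1, ys.index(y_j) + 1)
        counts[key] = counts.get(key, 0) + 1
    return {key: c / comb(len(pop), n) for key, c in counts.items()}


# Toy population of N = 6 pairs; sample size n = 3, sample medians i = j = 2.
pop = [(1, 4), (2, 1), (3, 6), (4, 2), (5, 5), (6, 3)]
exact = brute_force(pop, n=3, i=2, j=2)
```

For this toy population the two computations agree for every pair $(s, t)$, including the pairs to which the formula assigns probability zero. Enumeration is only feasible for very small $N$, which is why the closed forms above matter in practice.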

Now, if we consider our finite population to be a sample from
a super-population with continuous distribution function $F(x,y)$, then
our $m_0$, $x_{0(s)}$ and $y_{0(t)}$ become random variables. In this
circumstance (4.34) will be treated as a conditional distribution of
$(X_{(i)}, Y_{(j)})$ given $(m_0, x_{0(s)}, y_{0(t)})$. Since the finite
population, in such a case, is a sample from a continuous distribution,
the marginal distribution of $(m_0, x_{0(s)}, y_{0(t)})$ will be as given in
Siddiqui (1960).


§4.4 PREDICTION OF FINITE POPULATION QUANTILE USING AUXILIARY VARIABLES


In this section we would like to investigate the possibility of
using available information on the auxiliary variable, x, to get a
better estimate of the population quantile of y in finite population
sampling. Keeping the above objective in mind, we derived the bivariate
distribution of quantiles $(X_{(i)}, Y_{(j)})$ as given by (4.34) of the
preceding section. But we could not give a simpler form to this distribution.
Consequently we were unable to propose or investigate any reasonable
estimator for the population quantile of y using quantiles of x.
However, if we assume certain multivariate models, namely,
Model M or perhaps the more traditional multivariate normal
distribution model at the back of our realized finite population,




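then it appears that we can get a "predictor" of finite population
quantiles using auxiliary information.

Here we are going to use notation developed in Chapter 2.
Let $Y = (Y_1, \ldots, Y_N)'$ be the N-dimensional random vector giving the
finite population $y = (y_1, \ldots, y_N)'$. Under Model M, let the
$\xi$-expectation and $\xi$-variance of $Y$ be

$E(Y) = X\beta$ and $\mathrm{Var}(Y) = V$,

where $X$ is an $N \times p$ matrix, $\beta$ is a $p \times 1$ vector and
$V = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_N^2)$ is a known $N \times N$
non-singular positive definite matrix. Let the sample size be
$\nu(s) = n > p$. Let us partition $Y$ as:

$Y = (Y_s', Y_{\bar{s}}')'$,

where $Y_s$ is the $n \times 1$ vector of sampled units and $Y_{\bar{s}}$ is the
$(N-n) \times 1$ vector of non-sampled units. Accordingly we can partition
$X$ and $V$ as follows:

$X = \begin{pmatrix} X_s \\ X_{\bar{s}} \end{pmatrix}$, $V = \begin{pmatrix} V_{ss} & 0 \\ 0 & V_{\bar{s}\bar{s}} \end{pmatrix}$,

where $X_s$ is $n \times p$ and $V_{ss}$ is $n \times n$ etc.

The minimum variance unbiased estimate $t_{BLU}$ for the
population total $\sum_{k=1}^{N} Y_k$ as given in Theorem 2.8 is

(4.35)  $t_{BLU} = \mathbf{1}_s' y_s + \mathbf{1}_{\bar{s}}' X_{\bar{s}} \hat{\beta}_{BLU}$

where

$\hat{\beta}_{BLU} = (X_s' V_{ss}^{-1} X_s)^{-1} X_s' V_{ss}^{-1} y_s$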


and $\mathbf{1}_s$ and $\mathbf{1}_{\bar{s}}$ are vectors of the form
$(1, \ldots, 1)'$ having dimensions $n$ and $N-n$ respectively.

On the other hand, if we assume our super-population
model is $N(X\beta, V)$, i.e. $Y \sim N(X\beta, V)$, then the following theorem
due to Royall (1976a) gives the maximum likelihood estimator for
the population total.


Theorem 4.1. If $Y$ has a $N(X\beta, V)$ probability distribution in
which the known diagonal covariance matrix satisfies $V\mathbf{1} = X\gamma$
for some p-vector $\gamma$ (i.e., $\sigma_i^2 = \sum_{j=1}^{p} x_{ij}\gamma_j$,
$i = 1, \ldots, N$), then when $Y_s = y_s$ is observed, the likelihood
function for $t = \mathbf{1}'Y$ is proportional to the
$N\{t_{BLU}, \mathrm{var}(t_{BLU})\}$ probability density function,
where $t_{BLU}$ is given by (4.35), and

(4.36)  $\mathrm{var}(t_{BLU}) = \mathbf{1}_{\bar{s}}' V_{\bar{s}\bar{s}} \mathbf{1}_{\bar{s}} + \mathbf{1}_{\bar{s}}' X_{\bar{s}} (X_s' V_{ss}^{-1} X_s)^{-1} X_{\bar{s}}' \mathbf{1}_{\bar{s}}$.

The above theorem suggests that $t_{BLU}$ is the best linear
unbiased estimator under the normal super-population model, and that
$\mathrm{var}(t_{BLU})$ is its variance under the same model.


LU 
It is interesting to look more closely at (4.35). The term
$\mathbf{1}_s' y_s$ is the observed sample total and
$\mathbf{1}_{\bar{s}}' X_{\bar{s}} \hat{\beta}_{BLU}$ is a prediction of
$\mathbf{1}_{\bar{s}}' Y_{\bar{s}}$, the total of the $y_k$'s for non-sampled
units. So, we can consider

(4.37)  $\hat{Y}_{\bar{s}} = X_{\bar{s}} \hat{\beta}_{BLU}$

as the predictor of $Y_{\bar{s}}$. Now if our interest is in prediction of a
finite population quantile, then we can combine $y_s$ and $\hat{Y}_{\bar{s}}$, the
Ss 




predicted value of $Y_{\bar{s}}$, obtained using $y_s$ in $\hat{\beta}_{BLU}$, to get the
predicted population values of y-characters, viz.,
$\hat{y} = (y_s', \hat{Y}_{\bar{s}}')'$. Once
we have the predicted population $\hat{y}$ at hand, we can easily sort out the
required predicted quantile for the finite population. A predictor of the
$(t/N)$th quantile of the finite population will be the corresponding
quantile of the above mentioned predicted population. Obviously, the
predictor suggested above uses auxiliary information through $\hat{Y}_{\bar{s}}$.
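
The steps just described (estimate $\beta$ by weighted least squares from the sample, predict the non-sampled values as in (4.37), pool them with the observed values, and read off the required order value) can be sketched in a few lines of Python. This is only an illustration under the diagonal-$V$ assumption made above; the helper names `solve`, `blu_beta` and `predict_quantile`, and the toy data, are ours and not part of the thesis.

```python
def solve(A, b):
    """Solve the p x p linear system A z = b by Gauss-Jordan elimination
    with partial pivoting (adequate for the small p used here)."""
    p = len(b)
    M = [row[:] + [b[r]] for r, row in enumerate(A)]
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(p):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * d for a, d in zip(M[r], M[c])]
    return [M[r][p] / M[r][r] for r in range(p)]


def blu_beta(X_s, y_s, v_s):
    """beta_BLU = (X_s' V_ss^{-1} X_s)^{-1} X_s' V_ss^{-1} y_s, where v_s
    holds the diagonal elements of V_ss."""
    p = len(X_s[0])
    A = [[sum(x[a] * x[b] / v for x, v in zip(X_s, v_s)) for b in range(p)]
         for a in range(p)]
    c = [sum(x[a] * y / v for x, y, v in zip(X_s, y_s, v_s)) for a in range(p)]
    return solve(A, c)


def predict_quantile(X_s, y_s, v_s, X_r, t):
    """Predicted t-th order value of the population: pool the observed y_s
    with the predictions X_r * beta_BLU for the non-sampled units, then sort."""
    beta = blu_beta(X_s, y_s, v_s)
    y_r_hat = [sum(xa * ba for xa, ba in zip(x, beta)) for x in X_r]
    pooled = sorted(list(y_s) + y_r_hat)
    return pooled[t - 1]


# Toy data: intercept plus one auxiliary variable, N = 5, n = 3.
X_s = [[1.0, 1.0], [1.0, 2.0], [1.0, 4.0]]   # sampled units
y_s = [3.0, 5.0, 9.0]
v_s = [1.0, 1.0, 1.0]
X_r = [[1.0, 3.0], [1.0, 5.0]]               # non-sampled units
median_hat = predict_quantile(X_s, y_s, v_s, X_r, t=3)
```

In the toy data the sampled values lie exactly on a line, so the predicted population is recovered without error; with real data the quality of the predicted quantile depends on how well the assumed model describes the non-sampled units.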


As before, let $y_{0(t)}$ be the $(t/N)$th quantile of the finite
population which we are going to predict by $\hat{Y}_{(t)}$ as per the above
suggestions. At this stage it is required to investigate properties
such as $\xi$-unbiasedness and $\xi$-MSE of $\hat{Y}_{(t)}$. Let $t/N = q$,
$0 < q < 1$, and let $Y_{(q)}$ be the sample $q$th quantile obtained from $y_s$
without using the auxiliary information. This $Y_{(q)}$ is the commonly used
predictor of the corresponding population quantile $y_{0(t)}$. Now if

$E(\hat{Y}_{(t)} - y_{0(t)})^2 \le E(Y_{(q)} - y_{0(t)})^2$ for any sample $s \in S$, then

we shall have our proposed predictor $\hat{Y}_{(t)}$ at least as good as the
predictor $Y_{(q)}$, and if the strict inequality holds for some $s$, then
our predictor $\hat{Y}_{(t)}$ will be better than the predictor $Y_{(q)}$ under the
Model Gur or multivariate normal super-population model. 


To study these properties we need the moments or the distribution
of $\hat{Y}_{(t)}$, which at present we are unable to find. If we assume the
above mentioned multivariate normal super-population, then the
marginal distribution of $Y_s$ is also multivariate normal. But the
joint distribution of $\hat{Y}' = (Y_s', \hat{Y}_{\bar{s}}')$ is multivariate normal with
singular variance-covariance matrix with rank $n$. Obviously the $\hat{Y}_k$'s
are no longer independent and identically distributed; rather




the distribution depends on the columns of X. Exact distributions or
moments of order statistics for dependent variates have been studied
by Young (1967), Greig (1967) and Afonja (1972). But there they
considered parent distributions that are either exchangeable, or have
equal correlation among the variates, or have a non-singular variance-covariance
matrix. In survey sampling both N and n are usually
very large. So, asymptotic properties (as $N, n \to \infty$) may be of some
interest. Some general results on the asymptotic behaviour of functions
of order statistics with different mixing types of dependence are
available in Gastwirth and Rubin (1975) and Mehra and Rao (1975). But
these types of dependence apparently do not correspond to the nature
of dependence we have in our $\hat{Y}$. So, for developing useful properties
of our estimator $\hat{Y}_{(t)}$, further study will be required on the exact
and asymptotic distribution of order statistics and their functions
where the sample is drawn from a population with singular variance-covariance
matrix.




CHAPTER V 


ASYMPTOTIC RESULTS FOR SAMPLES FROM FINITE POPULATION 


§5.1 INTRODUCTION 


It is a common practice in survey sampling to use the central 
limit theorem for large populations and through this central limit 
theorem we use standard tests for testing hypothesis concerning finite 
population parameters. Rosen (1964) has given a systematic analytic 
basis for asymptotic behavior of our statistics based on sampling 
from finite population. There are also some earlier works in this 
area, namely, Erdos and Renyi (1959) and Hajek (1960). Recent works 
on the asymptotic behavior of order statistics and quantiles of a sample 
from finite population are Jha (1975) and Singh (1980). In this 
chapter we shall state some of these results and then we shall derive 


the asymptotic bivariate distribution of sample quantiles. 


§5.2 SOME ASYMPTOTIC RESULTS


Usually we consider our population,
$\pi = (y_{01}, \ldots, y_{0N})$, as a finite set of fixed numbers and the sample
of size $n$ drawn from this population is $Y_1, \ldots, Y_n$. When the
sampling is done without replacement, sample observations become
correlated due to sampling. Although this dependency can be ignored
for sufficiently large population size $N$, for samples drawn
without replacement the conventional limit procedure for independent
observations as $n \to \infty$ does not have any meaning. The population
will be exhausted after a finite number of drawings. So many authors




considered a double sequence of random variables as follows:

$Y_{11}, Y_{12}, \ldots, Y_{1,n_1}$ is a random sample from $\pi_1$
  ...
$Y_{k1}, Y_{k2}, \ldots, Y_{k,n_k}$ is a random sample from $\pi_k$
  ...

They considered the limiting behaviour of statistics based on the
sequence $\{\pi_k\}$ of populations and assumed that the $k$th population
size $N_k \to \infty$ as $k \to \infty$. Let

(5.1)  $\mu_k = \frac{1}{N_k}\sum_{j=1}^{N_k} y_{kj}$,  $\sigma_k^2 = \frac{1}{N_k - 1}\sum_{j=1}^{N_k} (y_{kj} - \mu_k)^2$

be respectively the $k$th population mean and variance. Let $F_k$ be the $k$th
population distribution function obtained by giving weight $1/N_k$ to
each element of $\pi_k$. $F_k$ is assumed to be right continuous. The
centered distribution function $F_k^0$ is defined by $F_k^0(y) = F_k(y + \mu_k)$.
Let $\bar{Y}^{(k)}_{n} = (Y_{k1} + \cdots + Y_{kn})/n$. Then the following two theorems
establish the convergence of the sample mean in finite population sampling.


Theorem 5.1. (Rosen, 1964): Let $\{\pi_k\}$ be a sequence of populations.
A sufficient condition for $\bar{Y}^{(k)}_{n_k} - \mu_k$ to converge almost surely to 0
for the sample size sequence (SSS) $\{n_k\}$ is that


wae 
(5y72) lime o C nepeee rere 
k + © bs K 


If $\{\pi_k\}$ satisfies




(5.3) lim sup f ydF(y) = 0 


then condition (5.2) is necessary for $\bar{Y}^{(k)}_{n_k} - \mu_k$ to converge in
probability to 0 for SSS $\{n_k\}$.

Theorem 5.2. (Rosen, 1964). Necessary and sufficient conditions
that $\bar{Y}^{(k)}_{n_k} - \mu_k$ converges to 0 with probability 1 for every SSS
$\{n_k\}$ with $n_k \to \infty$ when $k \to \infty$ is that $\{\pi_k\}$ satisfies,

(5.4)  $\lim_{A \to \infty} \sup_k \int_{|y| > A} |y| \, dF_k^0(y) = 0$

Central limit theorems for finite populations have been studied
by Erdos and Renyi (1959) and Hajek (1960, 1961). The following theorem,
due to Hajek (1960), gives a necessary and sufficient condition for
the sample total to be asymptotically normally distributed. As mentioned
in Chapter 1, let $U_k = \{1, 2, \ldots, N_k\}$ be the label set of the $k$th
population ($\pi_k$) units. Let $s_k$ be a simple random sample of size
$n_k$ from $U_k$, so that the $k$th sample total
$n_k \bar{Y}^{(k)}_{n_k} = \sum_{i \in s_k} y_{ki}$ has
mean and variance equal to $n_k \mu_k$ and $\frac{N_k - n_k}{N_k}\, n_k \sigma_k^2 = D_k$,
respectively, where $\mu_k$ and $\sigma_k^2$ are as defined in (5.1).
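
The stated mean and variance of the sample total can be verified by enumerating all samples from a small population. The sketch below assumes that $\sigma_k^2$ in (5.1) is defined with divisor $N_k - 1$, which is the convention under which the variance expression above is exact; the function names and the toy population are our own.

```python
from itertools import combinations
from statistics import mean, pvariance


def sample_total_moments(pop, n):
    """Exact mean and variance of the sample total over all simple random
    samples of size n drawn without replacement from `pop`."""
    totals = [sum(sample) for sample in combinations(pop, n)]
    return mean(totals), pvariance(totals)


def stated_moments(pop, n):
    """n * mu and ((N - n) / N) * n * sigma^2, with sigma^2 computed using
    divisor N - 1 (an assumption about the definition in (5.1))."""
    N = len(pop)
    mu = sum(pop) / N
    sigma2 = sum((y - mu) ** 2 for y in pop) / (N - 1)
    return n * mu, (N - n) / N * n * sigma2


pop = [1, 2, 4, 7, 11]
enumerated = sample_total_moments(pop, n=2)
formula = stated_moments(pop, n=2)
```

Both pairs agree up to rounding error, for this population and for any other choice of values and sample size.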


Theorem 5.3. (Hajek, 1960). Let $S_{k\tau}$ be the subset of elements of
$\pi_k$ on which the inequality

(5.5)  $|y_{kj} - \mu_k| > \tau \sqrt{D_k}$

holds, where $D_k$ is the variance of the $k$th sample total, $n_k \bar{Y}^{(k)}_{n_k}$.
Suppose $n_k \to \infty$ and $(N_k - n_k) \to \infty$.
Then the random variable $n_k \bar{Y}^{(k)}_{n_k}$ has an asymptotically normal
distribution with parameters $(n_k \mu_k, D_k)$ if and only if

(5.6)  $\lim_{k \to \infty} \dfrac{\sum_{j \in S_{k\tau}} (y_{kj} - \mu_k)^2}{\sum_{j=1}^{N_k} (y_{kj} - \mu_k)^2} = 0$, for any $\tau > 0$.



Estimation of quantiles is usually considered either under a parametric
model or with hardly any restriction concerning the distribution. In the
first situation an efficient estimator for the unknown quantile can be
derived from the efficient estimator of the unknown parameter. In the
second case, the natural estimator, namely the sample quantile, cannot be
beaten (Reiss, 1980). We shall now discuss asymptotic behaviour of sample
quantiles. For this, we need the concept of the empirical distribution
function $G(t,n)$ corresponding to the sample $Y_1, \ldots, Y_n$ from $\pi$,
which is defined as

(5.7)  $G(t,n) = \frac{1}{n} \sum_{j=1}^{n} I(t - Y_j)$,

where $I(\cdot)$ is the indicator function with

(5.8)  $I(u) = 1$ if $u \geq 0$, and $I(u) = 0$ if $u < 0$.

If no complexity arises then we shall use $G(t)$ for $G(t,n)$.




Let $0 < p < 1$. Then the $p$th quantile of a distribution
function $F(t)$ is defined as the supremum over the t-values for which
$F(t) < p$. Analogously, we define the empirical $p$th quantile
corresponding to a sample of size $n$ from $\pi$ as the supremum over
the t-values for which $G(t,n) < p$. The following theorem, due to Rosen
(1964), gives the asymptotic behaviour of the empirical $p$th quantile.
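
To make the definitions concrete, here is a small Python sketch of the empirical distribution function and of the empirical $p$th quantile taken as the supremum of the t-values with $G(t,n) < p$. The function names are illustrative assumptions. Since $G$ is a right-continuous step function, the supremum is the smallest order statistic $Y_{(k)}$ with $k/n \geq p$.

```python
def empirical_df(sample, t):
    """G(t, n): proportion of sample values Y_j with t - Y_j >= 0."""
    return sum(1 for y in sample if y <= t) / len(sample)


def empirical_quantile(sample, p):
    """sup{t : G(t, n) < p} for 0 < p < 1, i.e. the smallest order statistic
    Y_(k) whose rank k satisfies k / n >= p."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    ordered = sorted(sample)
    n = len(ordered)
    for k, y in enumerate(ordered, start=1):
        if k / n >= p:
            return y


sample = [3, 1, 4, 1.5, 9]
median = empirical_quantile(sample, 0.5)
```

For this sample of five values the empirical median is the third order statistic, and `empirical_df` jumps to at least $p$ exactly at the value returned by `empirical_quantile`.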


Theorem 5.4. Let $Y(p, n_k)$ be the empirical $p$th quantile in a
sample of size $n_k$ from $\pi_k$, $k = 1, 2, \ldots$. We assume that there is
a continuous distribution function $F(t)$ such that

(5.9)  $\lim_{k \to \infty} \sqrt{n_k}\, \sup_t |F_k(t) - F(t)| = 0$

and, furthermore, that $F'(t)$ is continuous and positive in a vicinity
of the $p$th quantile $\eta_p$ of $F(t)$. Now, if $\lim_{k \to \infty} n_k = \infty$ and
$\lim_{k \to \infty} n_k / N_k < 1$,
then for every real $a$,

(5.10)  $\lim_{k \to \infty} P\left[ \dfrac{F'(\eta_p)\,(Y(p, n_k) - \eta_p)}{\sqrt{p(1-p)\left(\frac{1}{n_k} - \frac{1}{N_k}\right)}} \le a \right] = \dfrac{1}{\sqrt{2\pi}} \int_{-\infty}^{a} e^{-x^2/2}\, dx$.

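
Read as an approximation for large $n_k$ and $N_k$, Theorem 5.4 says that the sample quantile has standard deviation roughly $\sqrt{p(1-p)(1/n - 1/N)}\,/\,F'(\eta_p)$. The short sketch below evaluates this quantity; the function name and the numerical values are hypothetical and serve only to show the effect of the finite population term $1/N$.

```python
from math import sqrt


def quantile_asymptotic_sd(p, n, N, density_at_quantile):
    """Approximate standard deviation of the empirical p-th quantile in a
    sample of size n from a population of size N, following the
    normalisation in Theorem 5.4. `density_at_quantile` stands for F'(eta_p)."""
    if not (0 < p < 1 and 0 < n < N and density_at_quantile > 0):
        raise ValueError("need 0 < p < 1, 0 < n < N and a positive density")
    return sqrt(p * (1 - p) * (1 / n - 1 / N)) / density_at_quantile


# Median, n = 100 units sampled out of N = 400, density 1 at the median.
sd_finite = quantile_asymptotic_sd(0.5, 100, 400, 1.0)
# The same sample size from a very large population, for comparison.
sd_large = quantile_asymptotic_sd(0.5, 100, 10**12, 1.0)
```

With a sampling fraction of one quarter the standard deviation shrinks by the factor $\sqrt{1 - n/N} = \sqrt{0.75}$ relative to the infinite population value $\sqrt{p(1-p)/n}\,/\,F'(\eta_p)$.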


This work of Rosen was later extended by Singh (1980). Singh 
(1980) has shown that after proper normalization, the weak limit of the 
process g(t) is ee where W° is a Brownian bridge on D[0,1], 


the space of all right continuous functions on [0,1] having left hand 




limit (for details on the space D[0,1] please see Billingsley (1968)). 




ASYMPTOTIC BIVARIATE DISTRIBUTION OF SAMPLE QUANTILES


In Chapter 4, we derived the joint distribution of
$(X_{(i)}, Y_{(j)})$ for a sample from a bivariate finite
population. There we considered $X_{(i)}$ and $Y_{(j)}$ as the sample
quantile of the x-values and the sample quantile of the y-values, respectively,
and, depending on the population configuration, we derived five
different forms of the probability function for $(X_{(i)}, Y_{(j)})$. In this
section we shall study the asymptotic behaviour of those distributions.

We shall first consider the probability function under Case 1
of Chapter 4 (relation (4.29) with $s$ replaced by $r$). Let us
assume that $N$ is so large that $m_0$, $r-m_0-2$, $t-m_0-2$ and
$N+m_0-r-t+2$ are also sufficiently large for applying the following
approximations:


$$\binom{m_0}{m} = \frac{m_0 (m_0 - 1) \cdots (m_0 - m + 1)}{m!} = \frac{m_0^{m} \left\{ 1 \cdot \left(1 - \frac{1}{m_0}\right) \left(1 - \frac{2}{m_0}\right) \cdots \left(1 - \frac{m-1}{m_0}\right) \right\}}{m!} \approx \frac{m_0^{m}}{m!} \quad \text{for large } m_0. \tag{5.11}$$

Similarly,

$$\binom{r - m_0 - 2}{i - m - 2} \approx \frac{(r - m_0 - 2)^{\,i-m-2}}{(i - m - 2)!}, \tag{5.12}$$

$$\binom{t - m_0 - 2}{j - m - 2} \approx \frac{(t - m_0 - 2)^{\,j-m-2}}{(j - m - 2)!}, \tag{5.13}$$

$$\binom{N + m_0 - r - t + 2}{n + m - i - j + 2} \approx \frac{(N + m_0 - r - t + 2)^{\,n+m-i-j+2}}{(n + m - i - j + 2)!}, \tag{5.14}$$

$$\binom{N}{n} \approx \frac{N^{n}}{n!}. \tag{5.15}$$
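The quality of approximations of the type (5.11) to (5.15) can be inspected numerically. The short sketch below is only an illustration: the helper name `binomial_ratio` and the trial values are assumptions. It returns the exact binomial coefficient divided by its approximation, which by (5.11) equals the product $(1 - 1/m_0)(1 - 2/m_0)\cdots(1 - (m-1)/m_0)$.

```python
import math


def binomial_ratio(m0, m):
    """Exact C(m0, m) divided by the approximation m0**m / m!."""
    return math.comb(m0, m) * math.factorial(m) / m0 ** m


# The ratio tends to 1 as m0 grows with m held fixed.
ratio_small = binomial_ratio(10, 5)       # poor: m0 is not large relative to m
ratio_large = binomial_ratio(10 ** 6, 5)  # very close to 1
```

The comparison makes the implicit requirement visible: the lower index must stay small relative to the upper one, so the approximations are appropriate when the sample-level counts are small compared with the population-level counts.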


Case 1 implies that units (9 (Ly °%q? and (5 >Vocty? are both in the 
sample s, and Xo < x0 (x) » YG < Yocty: Probability that these 


two particular units will be in the sample is, 


1 


(5.16) Va-ebD Prit (Xo (727)? (VQ ¢4y)} e. si 

For large N, (5.16) can be approximated by 

(5.17) ON Peace: oy Hic she Pie sy...) e's] 
n2 Or) e560 Oz OCR) 

Let us denote $\lim_{N \to \infty} r/N = \alpha$, $\lim_{N \to \infty} t/N = \beta$, and the joint
distribution function of $(X, Y)$ as $N \to \infty$ by $F(x, y)$, which we
shall assume continuous. So, for large $N$, we can express the right-
hand side of (5.17) as


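$$\Pr[x_\alpha < X \le x_\alpha + \Delta x,\; Y \le y_\beta] \cdot \Pr[X \le x_\alpha,\; y_\beta < Y \le y_\beta + \Delta y].$$

But

$$\Pr[x_\alpha < X \le x_\alpha + \Delta x,\; Y \le y_\beta] = \frac{\Pr[X \le x_\alpha + \Delta x,\; Y \le y_\beta] - \Pr[X \le x_\alpha,\; Y \le y_\beta]}{\Delta x}\, \Delta x \;\to\; \frac{\partial F(x, y)}{\partial x}\, dx \quad \text{as } \Delta x \to 0. \tag{5.18}$$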




Similarly,

$$\Pr[X \le x_\alpha,\; y_\beta < Y \le y_\beta + \Delta y] \;\to\; \frac{\partial F(x, y)}{\partial y}\, dy \quad \text{as } \Delta y \to 0. \tag{5.19}$$
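The limiting step in (5.18) and (5.19) is an ordinary difference quotient of the joint distribution function. As a hedged illustration only, the sketch below uses a hypothetical continuous distribution function on the unit square (a Farlie-Gumbel-Morgenstern form with an assumed parameter `THETA`, not a model used in this thesis) and compares the strip probability divided by $\Delta x$ with the analytic partial derivative.

```python
THETA = 0.5  # hypothetical dependence parameter, chosen only for illustration


def F(x, y):
    """Assumed joint distribution function on the unit square."""
    return x * y * (1 + THETA * (1 - x) * (1 - y))


def dF_dx(x, y):
    """Analytic partial derivative of F with respect to x."""
    return y * (1 + THETA * (1 - y) * (1 - 2 * x))


def strip_probability(x, y, dx):
    """Pr[x < X <= x + dx, Y <= y], written as a difference of F values."""
    return F(x + dx, y) - F(x, y)


x0, y0, dx = 0.3, 0.6, 1e-6
quotient = strip_probability(x0, y0, dx) / dx
error = abs(quotient - dF_dx(x0, y0))
```

For small `dx` the quotient agrees with `dF_dx(x0, y0)` up to a term of order `dx`, which is the content of (5.18); the same check with the roles of $x$ and $y$ exchanged corresponds to (5.19).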


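So, finally, taking the Case 1 probability of the finite case to be
equal to $P[m, x_\alpha, y_\beta]\,dx\,dy$ as $N \to \infty$, we get, by
using (5.11) - (5.19) in (4.29),

$$P[m, x_\alpha, y_\beta]\,dx\,dy = \frac{n!}{m!\,(i-m-2)!\,(j-m-2)!\,(n+m-i-j+2)!} \left(\frac{m_0}{N}\right)^{m} \left(\frac{r-m_0-2}{N}\right)^{i-m-2} \left(\frac{t-m_0-2}{N}\right)^{j-m-2} \times \left(\frac{N+m_0-r-t+2}{N}\right)^{n+m-i-j+2} \frac{\partial F}{\partial x}\, \frac{\partial F}{\partial y}\, dx\,dy, \tag{5.20}$$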


where the partial derivatives are evaluated at $(x_\alpha, y_\beta)$. Let, as $N \to \infty$,

$$\frac{m_0}{N} \to p_1, \qquad \frac{r-m_0-2}{N} \to p_2, \qquad \frac{t-m_0-2}{N} \to p_3, \qquad \frac{N+m_0-r-t+2}{N} \to p_4,$$

so that we can write (5.20) as:

$$P[m, x_\alpha, y_\beta]\,dx\,dy = \frac{n!\; p_1^{\,m}\, p_2^{\,i-m-2}\, p_3^{\,j-m-2}\, p_4^{\,n+m-i-j+2}}{m!\,(i-m-2)!\,(j-m-2)!\,(n+m-i-j+2)!}\; \frac{\partial F}{\partial x}\, \frac{\partial F}{\partial y}\, dx\,dy. \tag{5.21}$$


Our (5.21) is exactly the same as (3.1) of Siddiqui (1960). Similarly,
we can obtain approximations for Case 2, Case 3, Case 4 and Case 5. Hence our
bivariate distribution of $(X_{(i)}, Y_{(j)})$ for a finite population, as
$N \to \infty$, conforms to the bivariate distribution of $(X_{(i)}, Y_{(j)})$ for a
continuous population as derived by Siddiqui (1960).
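The Case 1 limiting form has a multinomial-type structure: a factorial coefficient, four cell proportions raised to exponents that sum to $n-2$, and the two partial derivatives of $F$. The sketch below evaluates such a term; it is an illustration under stated assumptions, not Siddiqui's code. The function name, the argument layout, and the labels `p1` to `p4` for the four limiting proportions are introduced here for convenience.

```python
import math


def case1_limit_term(n, i, j, m, p, dF_dx, dF_dy):
    """Evaluate the Case 1 limiting term for sample size n, ranks i and j,
    count m, assumed limiting proportions p = (p1, p2, p3, p4), and the
    partial derivatives of F at the evaluation point."""
    exponents = (m, i - m - 2, j - m - 2, n + m - i - j + 2)
    if min(exponents) < 0:
        return 0.0  # configuration impossible under Case 1
    coeff = math.factorial(n) / math.prod(math.factorial(e) for e in exponents)
    cells = math.prod(q ** e for q, e in zip(p, exponents))
    return coeff * cells * dF_dx * dF_dy


# Hypothetical check: independent uniform X and Y on the unit square,
# evaluated at (0.5, 0.5), where dF/dx = y = 0.5 and dF/dy = x = 0.5.
value = case1_limit_term(2, 2, 2, 0, (0.25, 0.25, 0.25, 0.25), 0.5, 0.5)
```

With $n = 2$, $i = j = 2$ and $m = 0$, all exponents vanish and the term reduces to $2 \cdot \partial F/\partial x \cdot \partial F/\partial y = 0.5$, which matches the direct argument that either of the two units can carry the larger x-value while the other carries the larger y-value.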




REFERENCES 


AFONJA, B. (1972). The moments of the maximum of correlated normal
and t-variates. J. R. Statist. Soc. B, 34, 251-262.


BARNARD, G.A. (1971). Discussion of paper by V.P. Godambe and
M.E. Thompson. J. R. Statist. Soc. B, 33, 376-378.


BASU, D. (1969). Role of the sufficiency and likelihood principles 
in sample survey theory. Sankhya, A. 31, 441-454. 


BASU, D. (1971). An essay on the logical foundations of survey sampling, 
part one. In V.P. Godambe and D.A. Sprott, Eds., Foundations 
of Statistical Inference. Toronto: Holt, Rinehart and Winston, 
203-242. 


BASU, D. (1978). Relevance of randomization in data analysis (with 
discussion), in Survey Sampling and Measurement, ed. 
N.K. Namboodiri, New York, Academic Press, 267-339. 


BASU, D. (1980). Randomization analysis of experimental data: The 
Fisher randomization test (with discussion). J. Amer. Statist. 
Asso. 75, 575-595. 


BILLINGSLEY, P. (1968). Convergence of Probability Measures. New York: 
Wiley. 


BREWER, K.R.W. (1963). Ratio estimation and finite population: Some
results deducible from the assumption of an underlying
stochastic process. Aust. J. Statist., 5, 93-105.


CASSEL, C.M., SARNDAL, C.E., and WRETMAN, J.H. (1976). Some results on 
generalized regression estimation for finite populations. 
Biometrika, 63, 615-620. 


CASSEL, C.M., SARNDAL, C.E., and WRETMAN, J.H. (1977). Foundations of
Inference in Survey Sampling. Wiley-Interscience. 


COCHRAN, W.G. (1939). The use of analysis of variance in enumeration 
by sampling. J. Amer. Statist. Ass. 34, 492-510. 


COCHRAN, W.G. (1946). Relative accuracy of systematic and stratified 
random samples for a certain class of populations. Ann. 
Math. Statist. 17, 164-177. 

COCHRAN, W.G. (1977). Sampling Techniques, 3rd ed., New York: Wiley. 

DEMING, W.E. and STEPHAN, F. (1941). On the interpretation of census 
as samples. J. Amer. Statist. Ass. 36, 45-49. 




ERDOS, P. and RENYI, A. (1959). On central limit theorem for samples 
from a finite population. Publ. Math. Inst. Hung. Acad. 
Sci., 4, 49-61. 


ERICSON, W.A. (1965). Optimum stratified sampling using prior 
information. J. Amer. Statist. Ass. 60, 750-771. 


ERICSON, W.A. (1969a). Subjective Bayesian models in sampling finite
populations. J. Roy. Statist. Soc. B, 31, 195-224.


ERICSON, W.A. (1969b). Subjective Bayesian models in sampling finite
populations: Stratification. In N.L. Johnson and H. Smith,
Eds., New Developments in Survey Sampling. New York:
Wiley-Interscience, 326-357.


FORMAN, E.K. and BREWER, K.R.W. (1971). The efficient use of supplementary
information in standard sampling procedures. J. Roy.
Statist. Soc. B, 33, 391-400.


FULLER, W.A. (1970). Simple estimators for the mean of skewed popula- 
tions. Tech. Report. Iowa State University. 


GASTWIRTH, J.L. and RUBIN, H. (1975). The behavior of robust estimators 
on dependent data. Ann. Statist. 3, 1070-1100. 


GODAMBE, V.P. (1966). A new approach to sampling from finite populations, I.
J. Roy. Statist. Soc. B, 28, 310-328.


GODAMBE, V.P. (1968). Bayesian sufficiency in survey-sampling. Ann.
Math. Statist., 20, 363-373.


GODAMBE, V.P. and JOSHI, V.M. (1965). Admissibility and Bayes
estimation in sampling from finite populations, I. Ann.
Math. Statist., 36, 1707-1722.


GODAMBE, V.P. and THOMPSON, M.E. (1971). Bayes, fiducial and
frequency aspects of statistical inference in regression
analysis in survey sampling. J. Roy. Statist. Soc. B,
33, 361-390.


GODAMBE, V.P. and THOMPSON, M.E. (1973). Estimation in sampling theory
with exchangeable prior distributions. Ann. Statist., 1,
1212-1221.


GREIG, M. (1967). Extremes in a random assembly. Biometrika. 54, 
273-282. 


HACKING, I. (1965). Logic of Statistical Inference. London, Cambridge 
University Press. 




HÁJEK, J. (1960). Limiting distributions in simple random sampling
from a finite population. Publ. Math. Inst. Hung. Acad.
Sci., 5, 361-374.


HÁJEK, J. (1961). Some extension of the Wald-Wolfowitz-Noether
theorem. Ann. Math. Statist., 32, 506-523.


HANSEN, M.H. and HURWITZ, W.N. (1943). On the theory of sampling from 
finite populations. Ann. Math. Statist., 14, 333-362. 


HARTLEY, H.O. and RAO, J.N.K. (1968). A new estimation theory for 
sample surveys. Biometrika, 55, 547-557. 


HARTLEY, H.O. and RAO, J.N.K. (1969). A new estimation theory for
sample surveys, II. In N.L. Johnson and H. Smith, Eds.,
New Developments in Survey Sampling. New York: Wiley-Interscience,
147-164.


HARTLEY, H.O. and SIELKEN, R.L. Jr., (1975). A super-population view- 
point for finite population sampling. Biometrics, 31, 
411-422. 


HOLT, D. (1975). A generalization of balanced sampling. Sankhya,
37, C, 199-203.


HORVITZ, D.G. and THOMPSON, D.J. (1952). A generalization of sampling
without replacement from finite universe. J. Amer. Statist.
Ass. 47, 663-685.


JESSEN, R.J. (1942). Statistical investigation of a sample survey for
obtaining farm facts. Iowa Agr. Exp. Sta., Res. Bull. No. 304.


JHA, V.D. (1975). Asymptotic distribution concerning discrete order
statistics. The Mathematics Education, Vol. 9, No. 2, 29-32.


KALBFLEISCH, J.D. and SPROTT, D.A. (1969). Applications of likelihood 
and fiducial probability to sampling finite populations. In 
N.L. Johnson and H. Smith, Eds., New Developments in Survey 
Sampling. New York: Wiley-Interscience, 358-389. 


KEMPTHORNE, O. (1969). Some remarks on statistical inference in finite
sampling. In N.L. Johnson and H. Smith, Eds., New Developments
in Survey Sampling. New York: Wiley-Interscience, 671-695.


KOLEHMAINEN, O. (1981). Bayesian models in estimating the total of a 
finite population: Towards a general theory. Scand. J. 
Statist., 6, 2ioo2- 


MADOW, W.G. and MADOW, L.H. (1944). On the theory of systematic
sampling, I. Ann. Math. Statist., 15, 1-24.




MAHALANOBIS, P.C. (1944). On large-scale sample surveys. Roy. Soc.
Phil. Trans. B, 231, 329-451.


MEHRA, K.L. and RAO, M.S. (1975). On functions of order statistics 
for mixing processes. Ann. Statist., 3, 874-883. 


MEYER, J.S. (1972). Confidence intervals for quantiles in stratified 
random sampling. Unpublished Ph. D. dissertation, Iowa 
State University Library, Ames, Iowa. 


MUKHOPADHYAY, P. (1977). Robust estimators of finite population total 
under certain linear regression models. Sankhya, 39 C, 
71-87. 


NEYMAN, J. (1971). Discussion on paper by R.M. Royall. In V.P. Godambe
and D.A. Sprott, Eds., Foundations of Statistical Inference.
Toronto: Holt, Rinehart and Winston, 276-278.


RAJ, D. (1958). On the relative accuracy of some sampling techniques. 
J. Amer. Statist. Ass., 53, 98-101. 


RAMAKRISHNAN, M.R. (1970). Optimum estimators and strategies in survey 
sampling. Ph.D. Thesis, Indian Statistical Institute. 


RAO, C.R. (1971). Some aspects of statistical inference in problems 
of sampling from finite populations. In V.P. Godambe and 
D.A. Sprott, Eds., Foundations of Statistical Inference. 
Toronto: Holt, Rinehart and Winston, 197-202. 


RAO, J.N.K. (1975). On the foundations of survey sampling. In
J.N. Srivastava, Ed., A Survey of Statistical Design and
Linear Models. The Hague: North Holland, 489-505.


RAO, J.N.K. and BELLHOUSE, D.R. (1978). Optimal estimation of a finite
population mean under generalized random permutation models.
Carleton Mathematical Series, No. 151.


REISS, R.D. (1980). Estimation of quantiles in certain non-parametric 
models. Ann. Statist. 8, 87-105. 


RINGER, L.R., JENKINS, O.C. and HARTLEY, H.O. (1972). Root estimators
for the mean of skewed distributions. J. Amer. Statist.
Ass., 68, 414-419.


ROSÉN, B. (1964). Limit theorems for sampling from finite populations.
Arkiv för Matematik, 5, 383-424.


ROYALL, R.M. (1968). An old approach to finite population sampling 
theory. J. Amer. Statist. Ass., 63, 1269-1279. 




ROYALL, R.M. (1970a). Finite population sampling on labels in 
estimation. Ann. Math. Statist., 41, 1774-1779. 


ROYALL, R.M. (1970b). On finite population sampling theory under 
certain linear regression models. Biometrika, 57, 377-387. 


ROYALL, R.M. (1971). Linear regression models in finite population
sampling theory. In V.P. Godambe and D.A. Sprott, Eds.,
Foundations of Statistical Inference. Toronto: Holt,
Rinehart and Winston, 259-274.


ROYALL, R.M. (1976a). Likelihood functions in finite population 
sampling theory. Biometrika, 63, 605-614. 


ROYALL, R.M. (1976b). The linear least squares prediction approach 
to two-stage sampling. J. Amer. Statist. Ass., 71, 657-664. 


ROYALL, R.M. and CUMBERLAND, W.G. (1978a). Variance estimation in
finite population sampling. J. Amer. Statist. Asso., 73,
351-358.


ROYALL, R.M. and CUMBERLAND, W.G. (1978b). An empirical study of
prediction theory in finite population sampling: Simple
random sampling and ratio estimator, (with discussion). In
Survey Sampling and Measurement, ed. N.K. Namboodiri,
New York: Academic Press, 293-335.


ROYALL, R.M. and CUMBERLAND, W.G. (1981). An empirical study of the 
ratio estimator and estimators of the variance, (with 
discussions). J. Amer. Statist. Asso., 76, 66-88. 


ROYALL, R.M. and HERSON, J. (1973a). Robust estimation in finite 
populations I. J. Amer. Statist. Ass., 68, 880-889. 


ROYALL, R.M. and HERSON, J. (1973b). Robust estimation in finite 
populations II: Stratification on a size variable. J. 
Amer. Statist. Ass., 68, 890-893. 


SARNDAL, C.E. (1978). Design-based and model-based inference in survey
sampling. Scand. J. Statist., 5, 27-41.


SCOTT, A.J., BREWER, K.R.W. and HO, E.W.H. (1978). Finite population
sampling and robust estimation. J. Amer. Statist. Ass.,
73, 359-361.


SCOTT, A.J. and SMITH, T.M.F. (1969). Estimation in multistage surveys. 
J. Amer. Statist. Ass., 64, 830-840. 


SCOTT, A.J. and SMITH, T.M.F. (1974). Linear super-population models 
in survey sampling. Sankhya, 36. C., 143-146. 




SEDRANSK, J. and MEYER, J. (1978). Confidence intervals for the
quantiles of a finite population: Simple random and
stratified simple random sampling. J. Roy. Statist. Soc.,
B, 40, 239-252.


SIDDIQUI, M.M. (1960). Distribution of quantiles in samples from a 
bivariate population. J. Res. Nat. Bu. Standards, B., 
64B, 145-150. 


SINHA, B.K. (1976). On balanced sampling schemes. Calcutta Statist. 
Ass. Bull., 25, 129-138. 


SINGH, K. (1980). A note on sample quantile process of finite 
populations. Austral. J. Statist., 22, 358-363. 


SINGH, P. and GARG, J.N. (1979). On balanced random sampling. 
Sankhya, 41. C., 60-68. 


SMITH, H.F. (1938). An empirical law governing soil heterogeneity.
J. Agr. Sci., 28, 1-23.


SOLOMON, H. and ZACKS, S. (1970). Optimal design of sampling from
finite populations: A critical review and indication of
new research areas. J. Amer. Statist. Ass., 65, 653-677.


TALLIS, G.M. (1978). Note on robust estimation in finite populations.
Sankhya, 40, C., 136-138.


THOMSEN, I. (1978). Discussion of Carl-Erik Sarndal's paper. Scand.
J. Statist., 5, 43-45.


WILKS, S.S. (1962). Mathematical Statistics. John Wiley and Sons, 
Inc., New York. 


YOUNG, D.H. (1967). Recurrence relations between the P.D.F.'s of 
order statistics of dependent variables, and some 
applications. Biometrika, 54, 283-292. 


ZACKS, S. (1969). Bayes sequential designs for sampling finite 
populations. J. Amer. Statist. Ass., 64, 1342-1349. 




